Beyond Strong-Cyclic: Doing Your Best in Stochastic Environments

Beyond Strong-Cyclic: Doing Your Best in Stochastic Environments

Benjamin Aminof, Giuseppe De Giacomo, Sasha Rubin, Florian Zuleger

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

``Strong-cyclic policies" were introduced to formalize trial-and-error strategies and are known to work in Markovian stochastic domains, i.e., they guarantee that the goal is reached with probability 1. We introduce ``best-effort" policies for (not necessarily Markovian) stochastic domains. These generalize strong-cyclic policies by taking advantage of stochasticity even if the goal cannot be reached with probability 1. We compare such policies with optimal policies, i.e., policies that maximize the probability that the goal is achieved, and show that optimal policies are best-effort, but that the converse is false in general. With this framework at hand, we revisit the foundational problem of what it means to plan in nondeterministic domains when the nondeterminism has a stochastic nature. We show that one can view a nondeterministic planning domain as a representation of infinitely many stochastic domains with the same support but different probabilities, and that for temporally extended goals expressed in LTL/LTLf a finite-state best-effort policy in one of these domains is best-effort in each of the domains. In particular, this gives an approach for finding such policies that reduces to solving finite-state MDPs with LTL/LTLf goals. All this shows that ``best-effort" policies are robust to changes in the probabilities, as long as the support is unchanged.
Keywords:
Knowledge Representation and Reasoning: Reasoning about actions
Planning and Scheduling: Theoretical Foundations of Planning