POMDPs under probabilistic semantics
Published in | Artificial Intelligence, Vol. 221, pp. 46–72 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 01.04.2015 |
Summary: We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated with every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) a quantitative constraint, which defines the set of paths where the payoff is at least a given threshold λ1 ∈ (0,1]; and (ii) a qualitative constraint, which is the special case of the quantitative constraint with λ1 = 1. We study the computation of the almost-sure winning set, where the controller must ensure that the path constraint is satisfied with probability 1. Our main results for qualitative path constraints are as follows: (i) deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) deciding the existence of an infinite-memory controller is undecidable. For quantitative path constraints, we show that deciding the existence of a finite-memory controller is undecidable. We also present a prototype implementation of our EXPTIME algorithm and experimental results on several examples.
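To make the payoff and path constraints in the summary concrete, the following sketch states them in standard notation; the symbols ρ, r, σ and the lim inf convention are assumptions for illustration, not taken from the paper itself.

```latex
% Sketch of the definitions in the summary; the notation (rho, r, sigma)
% and the lim inf convention are assumptions, not taken from the paper.

% Limit-average payoff of an infinite path \rho = s_0 a_0 s_1 a_1 \ldots
% with transition rewards r(s_i, a_i, s_{i+1}) \in [0,1]:
\[
  \mathrm{LimAvg}(\rho) \;=\; \liminf_{n \to \infty} \frac{1}{n}
      \sum_{i=0}^{n-1} r(s_i, a_i, s_{i+1})
\]

% Quantitative path constraint with threshold \lambda_1 \in (0,1]:
% the set of paths whose payoff is at least \lambda_1.
\[
  \{\, \rho \mid \mathrm{LimAvg}(\rho) \ge \lambda_1 \,\}
\]

% Almost-sure winning for a controller \sigma: the constraint must hold
% with probability 1 (the qualitative case is \lambda_1 = 1).
\[
  \Pr^{\sigma}\!\big[\,\mathrm{LimAvg}(\rho) \ge \lambda_1\,\big] \;=\; 1
\]
```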
ISSN: | 0004-3702; 1872-7921 |
DOI: | 10.1016/j.artint.2014.12.009 |