POMDPs under probabilistic semantics
We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated with every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) a q...
Published in | Artificial intelligence Vol. 221; pp. 46 - 72 |
---|---|
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 01.04.2015 |