Competitive Markov decision processes with partial observation

Bibliographic Details
Published in: 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583), Vol. 1, pp. 236-241
Main Authors: Hsu, Shun-Pin; Arapostathis, A.
Format: Conference Proceeding
Language: English
Published: Piscataway, NJ: IEEE, 2004
Summary: We study a class of Markov decision processes (MDPs) over the infinite time horizon in which there are two controllers and the observation information is allowed to be imperfect. Suppose the state space and the action space are both finite, and that the controllers, having conflicting interests, make decisions independently, each seeking its own best long-run average profit. Under the hypothesis that at least one system state is perfectly observable and accessible (from every system state, no matter what actions are taken), we prove the existence of optimal policies for both controllers and characterize them by dynamic programming equations of min-max type. An example from a class of machine maintenance processes is presented to illustrate our results.
ISBN: 0780385667; 9780780385665
ISSN: 1062-922X; 2577-1655
DOI: 10.1109/ICSMC.2004.1398303
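
To give a concrete sense of the min-max dynamic programming structure referred to in the summary, the sketch below shows standard Shapley-style value iteration for a finite zero-sum stochastic game, where each state defines an auxiliary matrix game solved by linear programming. This is only an illustrative surrogate under simplifying assumptions (discounted payoff, fully observed state; the array names P and r and both function names are hypothetical); the paper itself treats the long-run average criterion with partial observation, and its exact equations may differ.

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value and optimal mixed strategy of a zero-sum matrix game
    for the row (maximizing) player, computed via linear programming.

    A[i, j] is the payoff to the row player when row i meets column j.
    """
    m, n = A.shape
    # Decision variables: (x_1, ..., x_m, v); linprog minimizes, so use -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every column j: v - sum_i A[i, j] * x_i <= 0.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # x must be a probability distribution over the rows.
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:m]

def shapley_value_iteration(P, r, beta=0.95, tol=1e-8, max_iter=10_000):
    """Discounted value iteration for a finite zero-sum stochastic game.

    P[x, a, b, y] -- transition probability from state x to y under actions (a, b)
    r[x, a, b]    -- one-stage payoff to the maximizing controller
    Returns the value vector over states.
    """
    n_states = r.shape[0]
    v = np.zeros(n_states)
    for _ in range(max_iter):
        v_new = np.empty(n_states)
        for x in range(n_states):
            # Auxiliary matrix game at state x: stage payoff plus discounted continuation value.
            G = r[x] + beta * np.einsum('aby,y->ab', P[x], v)
            v_new[x], _ = matrix_game_value(G)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new
```

A small example can be run by filling P with any row-stochastic transition tensor and r with arbitrary payoffs; the resulting fixed point satisfies, at each state, the max-min dynamic programming equation for the discounted game (max-min and min-max coincide here by the minimax theorem for matrix games).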