Competitive Markov decision processes with partial observation

Bibliographic Details
Published in: 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583), Vol. 1, pp. 236-241
Main Authors: Hsu, Shun-Pin; Arapostathis, A.
Format: Conference Proceeding
Language: English
Published: Piscataway, NJ: IEEE, 2004
Summary: We study a class of Markov decision processes (MDPs) over the infinite time horizon in which there are two controllers and the observation information is allowed to be imperfect. Suppose the state space and the action space are both finite, and that the controllers, having conflicting interests, make decisions independently, each seeking its own best long-run average profit. Under the hypothesis that at least one system state is perfectly observable and accessible (from every system state, no matter what actions are taken), we prove the existence of optimal policies for both controllers and characterize them by dynamic programming equations of min-max type. An example from a class of machine maintenance processes is presented to illustrate our results.
ISBN: 0780385667; 9780780385665
ISSN: 1062-922X; 2577-1655
DOI: 10.1109/ICSMC.2004.1398303
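
To give a concrete sense of the min-max dynamic programming structure referred to in the summary, the sketch below shows standard Shapley-style value iteration for a finite zero-sum stochastic game, where each state defines an auxiliary matrix game solved by linear programming. This is only an illustrative surrogate under simplifying assumptions (discounted payoff, fully observed state; the array names P and r and both function names are hypothetical); the paper itself treats the long-run average criterion with partial observation, and its exact equations may differ.

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value and optimal mixed strategy of a zero-sum matrix game
    for the row (maximizing) player, computed via linear programming.

    A[i, j] is the payoff to the row player when row i meets column j.
    """
    m, n = A.shape
    # Decision variables: (x_1, ..., x_m, v); linprog minimizes, so use -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every column j: v - sum_i A[i, j] * x_i <= 0.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # x must be a probability distribution over the rows.
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:m]

def shapley_value_iteration(P, r, beta=0.95, tol=1e-8, max_iter=10_000):
    """Discounted value iteration for a finite zero-sum stochastic game.

    P[x, a, b, y] -- transition probability from state x to y under actions (a, b)
    r[x, a, b]    -- one-stage payoff to the maximizing controller
    Returns the value vector over states.
    """
    n_states = r.shape[0]
    v = np.zeros(n_states)
    for _ in range(max_iter):
        v_new = np.empty(n_states)
        for x in range(n_states):
            # Auxiliary matrix game at state x: stage payoff plus discounted continuation value.
            G = r[x] + beta * np.einsum('aby,y->ab', P[x], v)
            v_new[x], _ = matrix_game_value(G)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new
```

A small example can be run by filling P with any row-stochastic transition tensor and r with arbitrary payoffs; the resulting fixed point satisfies, at each state, the max-min dynamic programming equation for the discounted game (max-min and min-max coincide here by the minimax theorem for matrix games).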