Risk-Averse Planning Under Uncertainty
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 27.09.2019 |
Summary: | We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is therefore undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite-state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and an optimality criterion, the proposed method modifies the stochastic finite-state controller, leading to sub-optimal solutions with lower coherent risk. |
---|---|
DOI: | 10.48550/arxiv.1909.12499 |
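
The abstract refers to stochastic finite-state (memory) controllers for POMDPs evaluated under coherent risk measures. The sketch below is not the paper's bounded policy iteration procedure; it is a minimal, hypothetical illustration of the objects involved: a toy POMDP, a stochastic finite-state controller with a fixed memory budget, and the conditional value-at-risk (CVaR) of the accumulated cost as one standard example of a coherent risk measure (evaluated statically here, rather than as the dynamic, nested risk the paper targets). All model sizes, names, and numbers are assumptions made for illustration.

```python
# Illustrative sketch only: evaluate a randomly parameterized stochastic
# finite-state controller (FSC) on a toy POMDP and estimate the CVaR of the
# accumulated cost. This is NOT the paper's method; it only shows the objects
# the abstract refers to. All numbers below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy POMDP: 3 hidden states, 2 actions, 2 observations.
n_s, n_a, n_o = 3, 2, 2
T = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))   # T[s, a] = next-state distribution
O = rng.dirichlet(np.ones(n_o), size=n_s)          # O[s]    = observation distribution
C = rng.uniform(0.0, 1.0, size=(n_s, n_a))         # C[s, a] = immediate cost

# Stochastic FSC with a memory budget of n_q internal nodes:
#   psi[q, o]    = distribution over actions in node q after observation o
#   eta[q, o, a] = distribution over next memory nodes
n_q = 4
psi = rng.dirichlet(np.ones(n_a), size=(n_q, n_o))
eta = rng.dirichlet(np.ones(n_q), size=(n_q, n_o, n_a))

def rollout(horizon=20):
    """Simulate one episode under the FSC and return the accumulated cost."""
    s, q, total = 0, 0, 0.0
    for _ in range(horizon):
        o = rng.choice(n_o, p=O[s])          # observe
        a = rng.choice(n_a, p=psi[q, o])     # act according to the FSC
        total += C[s, a]
        q = rng.choice(n_q, p=eta[q, o, a])  # update controller memory
        s = rng.choice(n_s, p=T[s, a])       # hidden state transition
    return total

def cvar(samples, alpha=0.9):
    """CVaR_alpha of a cost distribution: mean of the worst (1 - alpha) tail."""
    samples = np.sort(np.asarray(samples))
    tail = samples[int(np.ceil(alpha * len(samples))):]
    return tail.mean() if len(tail) else samples[-1]

costs = [rollout() for _ in range(5000)]
print(f"mean cost = {np.mean(costs):.3f},  CVaR_0.9 = {cvar(costs):.3f}")
```

In the setting the abstract describes, the controller parameters (here `psi` and `eta`) would not be drawn at random but optimized for a given memory budget, e.g., by solving convex programs inside a bounded policy-iteration loop, so that the resulting sub-optimal finite-state controller attains lower coherent risk.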