Risk-Averse Planning Under Uncertainty
| Published in | 2020 American Control Conference (ACC), pp. 3305-3312 |
|---|---|
| Main Authors | , , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | AACC, 01.07.2020 |
| Summary | We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite-state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and an optimality criterion, the proposed method modifies the stochastic finite-state controller, leading to suboptimal solutions with lower coherent risk. |
| ISSN | 2378-5861 |
| DOI | 10.23919/ACC45564.2020.9147792 |
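The summary above refers to stochastic finite-state (memory) controllers for POMDPs: a controller with a fixed, finite set of memory nodes, a stochastic action-selection rule per node, and a stochastic node-transition rule driven by observations. The following is a minimal illustrative sketch of that controller structure only; the class, its parameter names, and the toy two-node example are assumptions for exposition, not the paper's construction or its policy-iteration procedure.

```python
import random

class FiniteStateController:
    """A stochastic finite-state controller (FSC) for a POMDP.

    The memory budget is the number of controller nodes. The controller
    never sees the hidden state: it picks actions from a per-node
    distribution and moves between nodes based on observations.
    """

    def __init__(self, action_probs, transition_probs, start_node=0):
        # action_probs[node] = {action: P(action | node)}
        # transition_probs[(node, observation)] = {next_node: P(next_node | node, observation)}
        self.action_probs = action_probs
        self.transition_probs = transition_probs
        self.node = start_node

    def act(self, rng):
        """Sample an action from the current node's action distribution."""
        actions, probs = zip(*self.action_probs[self.node].items())
        return rng.choices(actions, weights=probs)[0]

    def update(self, observation, rng):
        """Sample the next memory node given the received observation."""
        dist = self.transition_probs[(self.node, observation)]
        nodes, probs = zip(*dist.items())
        self.node = rng.choices(nodes, weights=probs)[0]
        return self.node


# Hypothetical two-node controller: stay cautious while observing "growl",
# switch to node 1 (and a different action) after observing "quiet".
controller = FiniteStateController(
    action_probs={0: {"listen": 1.0}, 1: {"open": 1.0}},
    transition_probs={
        (0, "growl"): {0: 1.0}, (0, "quiet"): {1: 1.0},
        (1, "growl"): {0: 1.0}, (1, "quiet"): {1: 1.0},
    },
)
```

In the method the summary describes, the distributions above would be the decision variables that bounded policy iteration adjusts, via convex optimization, to reduce the dynamic coherent risk of the induced trajectory distribution.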