Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 20.11.2023 |
Subjects | |
Online Access | Get full text |
Summary: | In most real-world reinforcement learning applications, state information is only partially observable, which breaks the Markov decision process assumption and leads to inferior performance for algorithms that conflate observations with state. Partially Observable Markov Decision Processes (POMDPs), on the other hand, provide a general framework that allows partial observability to be accounted for in learning, exploration, and planning, but they present significant computational and statistical challenges. To address these difficulties, we develop a representation-based perspective that leads to a coherent framework and a tractable algorithmic approach for practical reinforcement learning from partial observations. We provide a theoretical analysis justifying the statistical efficiency of the proposed algorithm, and we empirically demonstrate that it can surpass state-of-the-art performance with partial observations across various benchmarks, advancing reliable reinforcement learning towards more practical applications. |
DOI: | 10.48550/arxiv.2311.12244 |
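As background for the abstract above, the standard POMDP formulation it refers to can be sketched as follows; this is textbook material with generic notation, not drawn from this paper's specific method. A POMDP is specified by a tuple

\[
(\mathcal{S}, \mathcal{A}, \mathcal{O}, T, Z, r, \gamma),
\]

where \(T(s' \mid s, a)\) is the transition kernel over latent states, \(Z(o \mid s')\) the observation (emission) kernel, \(r(s, a)\) the reward, and \(\gamma \in [0, 1)\) the discount factor. Because the agent receives observations \(o\) rather than the latent state \(s\), learning and planning operate on a belief \(b(s) = \Pr(s \mid \text{history})\); after taking action \(a\) and observing \(o\), the belief is updated by Bayes' rule:

\[
b'(s') = \frac{Z(o \mid s')\, \sum_{s \in \mathcal{S}} T(s' \mid s, a)\, b(s)}{\sum_{\tilde{s} \in \mathcal{S}} Z(o \mid \tilde{s})\, \sum_{s \in \mathcal{S}} T(\tilde{s} \mid s, a)\, b(s)}.
\]

Maintaining and planning over this belief exactly is what makes POMDPs computationally and statistically hard, which is the difficulty the abstract's representation-based approach targets.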