Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning
Published in | arXiv.org |
---|---|
Main Authors | Luisa Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson |
Format | Paper / Journal Article |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 09.06.2021 |
Summary | To rapidly learn a new task, it is often essential for agents to explore efficiently -- especially when performance matters from the first timestep. One way to learn such behaviour is via meta-learning. Many existing methods, however, rely on dense rewards for meta-training, and can fail catastrophically if the rewards are sparse. Without a suitable reward signal, the need for exploration during meta-training is exacerbated. To address this, we propose HyperX, which uses novel reward bonuses for meta-training to explore in approximate hyper-state space (where hyper-states represent the environment state and the agent's task belief). We show empirically that HyperX meta-learns better task-exploration and adapts more successfully to new tasks than existing methods. |
ISSN | 2331-8422 |
DOI | 10.48550/arxiv.2010.01062 |
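
The summary above outlines HyperX's core idea: augmenting the meta-training reward with exploration bonuses computed over hyper-states, i.e. the environment state concatenated with the agent's task belief. As a minimal sketch of how such a bonus could be computed -- assuming a random-network-distillation-style novelty signal, which the abstract itself does not specify -- the PyTorch snippet below is illustrative only; the class name, network sizes, and dimensions are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class HyperStateNoveltyBonus(nn.Module):
    """Random-network-distillation-style novelty bonus over hyper-states
    (environment state concatenated with the task belief).
    Illustrative sketch only; not the authors' architecture."""

    def __init__(self, state_dim: int, belief_dim: int, feature_dim: int = 64):
        super().__init__()
        in_dim = state_dim + belief_dim  # hyper-state dimensionality
        # Fixed, randomly initialised target network (never trained).
        self.target = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, feature_dim)
        )
        for p in self.target.parameters():
            p.requires_grad_(False)
        # Predictor network, regressed onto the target on visited hyper-states.
        self.predictor = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, feature_dim)
        )

    def forward(self, state: torch.Tensor, belief: torch.Tensor) -> torch.Tensor:
        hyper_state = torch.cat([state, belief], dim=-1)
        with torch.no_grad():
            target_feat = self.target(hyper_state)
        pred_feat = self.predictor(hyper_state)
        # Prediction error is large on rarely visited hyper-states, so it can
        # serve as an exploration bonus (detach before adding to the reward).
        return ((pred_feat - target_feat) ** 2).mean(dim=-1)

if __name__ == "__main__":
    # Toy usage with made-up dimensions: a batch of 32 hyper-states.
    bonus_fn = HyperStateNoveltyBonus(state_dim=4, belief_dim=8)
    state, belief = torch.randn(32, 4), torch.randn(32, 8)
    bonus = bonus_fn(state, belief)  # shape: (32,)
    # Minimising this loss trains the predictor, so the bonus decays
    # on frequently visited hyper-states.
    bonus.mean().backward()
    print(bonus.shape)
```

Under this reading, the bonus would be added to the environment reward during meta-training (as the abstract states the bonuses are for meta-training) and the predictor trained on visited hyper-states, so the signal decays as regions of hyper-state space become familiar; at meta-test time the bonus would be dropped so the agent optimises the task reward alone.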