Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning


Bibliographic Details
Main Authors: Vamplew, Peter; Foale, Cameron; Hayes, Conor F.; Mannion, Patrick; Howley, Enda; Dazeley, Richard; Johnson, Scott; Källström, Johan; Ramos, Gabriel; Rădulescu, Roxana; Röpke, Willem; Roijers, Diederik M.
Format: Journal Article
Language: English
Published: 04.02.2024

Summary: Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.
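The core idea summarized above, a user-supplied utility function applied to environmental rewards, can be sketched in a few lines. This is an illustrative example only, not code from the paper: the `utility` function, the two-objective reward vectors, and the ESR-style (utility-of-accumulated-return) evaluation are all hypothetical choices made here for demonstration.

```python
import numpy as np

def utility(returns: np.ndarray) -> float:
    """Hypothetical nonlinear user utility over a 2-objective return vector:
    linear in the first objective, diminishing returns on the second."""
    return float(returns[0] + np.sqrt(max(returns[1], 0.0)))

def episode_utility(reward_vectors, gamma=1.0):
    """Accumulate the (discounted) sum of vector rewards over an episode,
    then apply the user's utility function to that accumulated return."""
    total = np.zeros_like(np.asarray(reward_vectors[0], dtype=float))
    discount = 1.0
    for r in reward_vectors:
        total += discount * np.asarray(r, dtype=float)
        discount *= gamma
    return utility(total)

# Example episode with two objectives (e.g. task reward and a second,
# safety-related reward): accumulated return is [1.5, 4.0].
rewards = [np.array([1.0, 4.0]), np.array([0.5, 0.0])]
print(episode_utility(rewards))  # 1.5 + sqrt(4.0) = 3.5
```

Because the utility function is applied after accumulation and may be nonlinear, maximizing expected utility is not the same as maximizing a fixed weighted sum of rewards; this distinction is what drives the algorithmic implications the paper examines.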
DOI: 10.48550/arxiv.2402.02665