Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning
Format | Journal Article
---|---
Language | English
Published | 04.02.2024
Summary: Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.
DOI: 10.48550/arxiv.2402.02665
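The utility-based paradigm summarised above separates environmental rewards from a user-supplied utility function applied to the accumulated return. A minimal sketch of that separation, under assumptions not taken from the paper (the two-objective rewards, the linear scalarisation weights, and the `episode_utility` helper are all illustrative, not the authors' algorithm):

```python
import numpy as np

# Assumed example utility: a weighted linear scalarisation of two objectives.
# The paper's framework allows arbitrary (including nonlinear) utilities;
# this choice is purely for illustration.
def utility(returns: np.ndarray) -> float:
    weights = np.array([0.7, 0.3])
    return float(weights @ returns)

def episode_utility(reward_vectors, gamma=1.0):
    """Accumulate a (discounted) vector return over an episode, then
    apply the user's utility function to the resulting return vector."""
    ret = np.zeros(len(reward_vectors[0]), dtype=float)
    discount = 1.0
    for r in reward_vectors:
        ret += discount * np.asarray(r, dtype=float)
        discount *= gamma
    return utility(ret)

# Hypothetical two-objective rewards over a three-step episode.
rewards = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
print(episode_utility(rewards))  # utility of the summed return [2.0, 3.0] -> 2.3
```

Note that applying the utility to the whole-episode return (as here) and optimising the expected utility of that return is one of the choices the utility-based literature distinguishes from scalarising each reward step-by-step; the sketch only illustrates the former.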