GENERALIZED REINFORCEMENT LEARNING AGENT

Bibliographic Details
Main Authors ZHANG, Cheng; HOFMANN, Katja; TSCHIATSCHEK, Sebastian; IGL, Maximilian; CIOSEK, Kamil Andrzej; LI, Yingzhen; DEVLIN, Sam Michael
Format Patent
Language English
Published 13.04.2023
Summary: An apparatus has a memory storing a reinforcement learning policy with an optimization component and a data collection component. The apparatus has a regularization component which applies regularization selectively between the optimization component and the data collection component of the reinforcement learning policy. A processor carries out a reinforcement learning process by: triggering execution of an agent according to the policy and with respect to a first task; observing values of variables comprising an observation space of the agent and an action of the agent; and updating the policy using reinforcement learning according to the observed values and taking into account the regularization.
Bibliography: Application Number: US202218052202
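
Illustrative sketch (not taken from the patent text): one way to read "applying regularization selectively" is to switch a regularizer off while the agent collects data and on while the policy is updated. The toy example below assumes this reading. The class name `Policy`, the choice of dropout as the regularizer, the REINFORCE-style update, and the two-context task are all hypothetical stand-ins chosen for brevity; the record does not specify any of them. The sketch also omits any correction for the mismatch between the acting and the updated policy.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Policy:
    """Linear softmax policy with optional dropout on observation features.

    Assumption: dropout stands in for "the regularization"; the source
    does not say which regularizer is used.
    """

    def __init__(self, n_obs, n_actions, dropout=0.5, seed=0):
        self.rng = random.Random(seed)
        self.w = [[0.0] * n_obs for _ in range(n_actions)]
        self.dropout = dropout

    def _features(self, obs, regularize):
        if not regularize:
            return list(obs)
        keep = 1.0 - self.dropout
        # Inverted dropout: zero a feature or rescale it by 1/keep.
        return [x / keep if self.rng.random() < keep else 0.0 for x in obs]

    def probs(self, obs, regularize):
        feats = self._features(obs, regularize)
        logits = [sum(w * f for w, f in zip(row, feats)) for row in self.w]
        return softmax(logits), feats

    def act(self, obs):
        # Data collection component: regularization switched OFF.
        p, _ = self.probs(obs, regularize=False)
        return self.rng.choices(range(len(p)), weights=p)[0]

    def update(self, obs, action, reward, lr=0.1):
        # Optimization component: regularization switched ON.
        p, feats = self.probs(obs, regularize=True)
        for a, row in enumerate(self.w):
            grad_logit = (1.0 if a == action else 0.0) - p[a]
            for i, f in enumerate(feats):
                row[i] += lr * reward * grad_logit * f


def train(steps=2000, seed=0):
    """Run the execute / observe / update loop on a toy two-context task."""
    policy = Policy(n_obs=2, n_actions=2, seed=seed)
    env_rng = random.Random(seed + 1)
    for _ in range(steps):
        ctx = env_rng.randrange(2)
        obs = [1.0 if i == ctx else 0.0 for i in range(2)]  # observed state
        action = policy.act(obs)                            # observed action
        reward = 1.0 if action == ctx else 0.0
        policy.update(obs, action, reward)
    return policy


if __name__ == "__main__":
    trained = train()
    print(trained.probs([1.0, 0.0], regularize=False)[0])
```

The only design point the sketch is meant to show is the `regularize` flag: `act` and `update` share the same weights but differ in whether the regularizer is active, which is one possible form of selective application.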