GENERALIZED REINFORCEMENT LEARNING AGENT

Bibliographic Details
Main Authors	ZHANG, Cheng; HOFMANN, Katja; TSCHIATSCHEK, Sebastian; IGL, Maximilian; CIOSEK, Kamil Andrzej; LI, Yingzhen; DEVLIN, Sam Michael
Format	Patent
Language	English
Published	13.04.2023
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS
Summary:	An apparatus has a memory storing a reinforcement learning policy with an optimization component and a data collection component. The apparatus has a regularization component which applies regularization selectively between the optimization component of the reinforcement learning policy and the data collection component of the reinforcement learning policy. A processor carries out a reinforcement learning process by: triggering execution of an agent according to the policy and with respect to a first task; observing values of variables comprising: an observation space of the agent, an action of the agent; and updating the policy using reinforcement learning according to the observed values and taking into account the regularization.
Bibliography:	Application Number: US202218052202
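
The process in the Summary (run the agent, observe observation and action values, update the policy with regularization applied selectively between the optimization and data collection components) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the regularizer is dropout on the observation, the policy is a linear softmax, the update is a plain REINFORCE step, the "first task" is a toy two-context problem, and regularization is switched on only for the optimization side. The sketch also ignores the mismatch between the collecting and the optimized distributions, which a fuller treatment would need to correct for.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LinearPolicy:
    """Hypothetical softmax policy with linear logits (logits = W @ obs)."""

    def __init__(self, n_obs, n_actions, drop_prob, rng):
        self.w = [[0.0] * n_obs for _ in range(n_actions)]
        self.drop_prob = drop_prob
        self.rng = rng

    def _mask(self, n, regularize):
        # Inverted-dropout mask; all ones when regularization is switched off.
        if not regularize or self.drop_prob == 0.0:
            return [1.0] * n
        keep = 1.0 - self.drop_prob
        return [1.0 / keep if self.rng.random() < keep else 0.0 for _ in range(n)]

    def probs(self, obs, regularize):
        """Return (action probabilities, possibly masked input)."""
        mask = self._mask(len(obs), regularize)
        x = [o * m for o, m in zip(obs, mask)]
        logits = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        return softmax(logits), x


def train(steps=2000, lr=0.5, drop_prob=0.2, seed=0):
    rng = random.Random(seed)
    policy = LinearPolicy(n_obs=2, n_actions=2, drop_prob=drop_prob, rng=rng)
    for _ in range(steps):
        # Toy task (assumption): one-hot context; reward 1 if action == context.
        ctx = rng.randrange(2)
        obs = [1.0 if i == ctx else 0.0 for i in range(2)]

        # Data collection component: act WITHOUT the regularizer.
        p_collect, _ = policy.probs(obs, regularize=False)
        action = 0 if rng.random() < p_collect[0] else 1
        reward = 1.0 if action == ctx else 0.0

        # Optimization component: compute the update WITH the regularizer.
        p_opt, x = policy.probs(obs, regularize=True)
        for a in range(2):
            # Gradient of log pi(action | x) with respect to logit a.
            grad_logit = (1.0 if a == action else 0.0) - p_opt[a]
            for i in range(2):
                policy.w[a][i] += lr * reward * grad_logit * x[i]
    return policy


if __name__ == "__main__":
    trained = train()
    for ctx in range(2):
        obs = [1.0 if i == ctx else 0.0 for i in range(2)]
        p, _ = trained.probs(obs, regularize=False)
        print(ctx, [round(v, 3) for v in p])
```

The single `regularize` flag on `probs` is what makes the regularization selective in this sketch: the rollout and the update call the same policy but pass different values. Flipping both calls to the same value would give an ordinary regularized or unregularized agent for comparison.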