REINFORCEMENT LEARNING FOR USER BEHAVIOUR

Bibliographic Details
Main Author	BAYKANER, Khan
Format	Patent
Language	English, French, German
Published	11.12.2019
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS

Summary:	A computer-implemented method comprising:
  using one or more synthetic user models to train a particular reinforcement learning agent, each of the one or more synthetic user models comprising a behaviour function and a response generation function, wherein
    a reinforcement learning agent is configured to produce an output action based on an input state to the reinforcement learning agent,
    a behaviour function is configured to model a response to different output actions produced by a reinforcement learning agent, and
    a response generation function is configured to use a behaviour function to generate a response to an output action produced by a reinforcement learning agent,
  and wherein the training comprises, for each of the one or more synthetic user models:
    the particular reinforcement learning agent producing an output action based on a current input state;
    the response generation function of the synthetic user model using the behaviour function of the synthetic user model to generate a response to the output action;
    appropriately updating the particular reinforcement learning agent based on the response; and
    iteratively repeating the producing an output action, generating a response and updating the particular reinforcement learning agent steps, wherein for each subsequent iteration the particular reinforcement learning agent takes as an input state the response generation function's response from the previous iteration in order to determine the output action to produce;
  and providing the particular trained reinforcement learning agent as an output for use by a user application.
Bibliography:	Application Number: EP20180175959
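
Illustrative sketch (not part of the patent record): the training loop described in the summary can be sketched in Python as below. The summary does not name a learning algorithm, an action set, a response set or a reward signal, so everything concrete here is an assumption made for illustration: tabular Q-learning with epsilon-greedy exploration as the "appropriate" update, the hypothetical actions "remind" and "wait", the hypothetical responses "engaged" and "ignored", the 0/1 reward mapping, and the names SyntheticUser, QAgent and train.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

ACTIONS = ["remind", "wait"]        # hypothetical output actions
RESPONSES = ["engaged", "ignored"]  # hypothetical responses, also used as input states


@dataclass
class SyntheticUser:
    """Synthetic user model: a behaviour function plus a response generation function."""
    behaviour: Callable[[str], float]  # assumed form: action -> probability of "engaged"

    def generate_response(self, action: str, rng: random.Random) -> str:
        # Response generation: sample a response using the behaviour function.
        return "engaged" if rng.random() < self.behaviour(action) else "ignored"


@dataclass
class QAgent:
    """Tabular Q-learning agent (an assumed choice; the summary names no algorithm)."""
    alpha: float = 0.1    # learning rate
    gamma: float = 0.9    # discount factor
    epsilon: float = 0.1  # exploration rate
    q: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def greedy(self, state: str) -> str:
        # Action with the highest current value estimate for this state.
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def act(self, state: str, rng: random.Random) -> str:
        # Epsilon-greedy: explore occasionally, otherwise act greedily.
        if rng.random() < self.epsilon:
            return rng.choice(ACTIONS)
        return self.greedy(state)

    def update(self, state: str, action: str, reward: float, next_state: str) -> None:
        # Standard one-step Q-learning update.
        best_next = max(self.q.get((next_state, a), 0.0) for a in ACTIONS)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (reward + self.gamma * best_next - old)


def train(agent: QAgent, users: List[SyntheticUser], steps: int = 2000, seed: int = 0) -> QAgent:
    rng = random.Random(seed)
    for user in users:                  # "for each of the one or more synthetic user models"
        state = "ignored"               # assumed initial input state
        for _ in range(steps):
            action = agent.act(state, rng)                    # agent produces an output action
            response = user.generate_response(action, rng)    # synthetic user responds
            reward = 1.0 if response == "engaged" else 0.0    # assumed reward mapping
            agent.update(state, action, reward, response)     # update agent from the response
            state = response            # the response becomes the next input state
    return agent                        # trained agent, provided as the output


# Two hypothetical synthetic users with different behaviour functions.
users = [
    SyntheticUser(lambda a: 0.9 if a == "remind" else 0.1),
    SyntheticUser(lambda a: 0.8 if a == "remind" else 0.2),
]
trained = train(QAgent(), users)
```

After training, trained.greedy(state) gives the action the agent currently prefers for a given previous response. Feeding the response back in as the next input state mirrors the final iteration step of the summary; a fuller state representation would be an equally valid reading.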