GENERATIVE NEURAL NETWORK SYSTEMS FOR GENERATING INSTRUCTION SEQUENCES TO CONTROL AN AGENT PERFORMING A TASK

Bibliographic Details
Main Authors Vinyals, Oriol; Eslami, Seyed Mohammadali; Ganin, Iaroslav; Kulkarni, Tejas Dattatraya
Format Patent
Language English
Published 02.09.2021
Abstract A generative adversarial neural network system to provide a sequence of actions for performing a task. The system comprises a reinforcement learning neural network subsystem coupled to a simulator and a discriminator neural network. The reinforcement learning neural network subsystem includes a policy recurrent neural network to, at each of a sequence of time steps, select one or more actions to be performed according to an action selection policy, each action comprising one or more control commands for a simulator. The simulator is configured to implement the control commands for the time steps to generate a simulator output. The discriminator neural network is configured to discriminate between the simulator output and training data, to provide a reward signal for the reinforcement learning. The simulator may be a non-differentiable simulator, for example a computer program to produce an image or audio waveform, or a program to control a robot or vehicle.
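The data flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the 2-D pixel "simulator", the random `Policy`, the linear `discriminator`, and the `discriminator_step` update are all hypothetical stand-ins chosen only to show how the policy's control commands, the non-differentiable simulator, and the discriminator's reward signal connect. The policy-gradient update itself is omitted.

```python
import random


def simulator(commands, size):
    """Toy non-differentiable simulator (assumed): each command (x, y) sets one pixel."""
    canvas = [[0] * size for _ in range(size)]
    for x, y in commands:
        canvas[y][x] = 1
    return canvas


class Policy:
    """Hypothetical stand-in for the policy recurrent network."""

    def __init__(self, size, rng):
        self.size = size
        self.rng = rng
        self.state = None  # placeholder for a recurrent state (last command)

    def act(self):
        # A real policy would condition on self.state; here we sample uniformly.
        command = (self.rng.randrange(self.size), self.rng.randrange(self.size))
        self.state = command
        return command


def discriminator(output, weights):
    """Toy linear critic (assumed): a higher score means 'more like training data'."""
    return sum(w * p for w_row, p_row in zip(weights, output)
               for w, p in zip(w_row, p_row))


def discriminator_step(weights, real, fake, lr):
    """Illustrative update: raise the score of real data, lower that of simulator output."""
    return [[w + lr * (r - f) for w, r, f in zip(w_row, r_row, f_row)]
            for w_row, r_row, f_row in zip(weights, real, fake)]


def rollout(policy, weights, steps, size):
    """One episode: select commands per time step, run the simulator, score the result."""
    commands = [policy.act() for _ in range(steps)]
    output = simulator(commands, size)
    reward = discriminator(output, weights)  # reward signal for reinforcement learning
    return commands, output, reward
```

In this sketch the reward is computed once, from the final simulator output, so no gradient ever needs to flow through `simulator`; that is what allows the simulator to be an arbitrary program.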
Author Vinyals, Oriol
Ganin, Iaroslav
Kulkarni, Tejas Dattatraya
Eslami, Seyed Mohammadali
ContentType Patent
DBID EVB
DatabaseName esp@cenet
DatabaseTitleList
Database_xml – sequence: 1
  dbid: EVB
  name: esp@cenet
  url: http://worldwide.espacenet.com/singleLineSearch?locale=en_EP
  sourceTypes: Open Access Repository
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
Chemistry
Sciences
Physics
ExternalDocumentID US2021271968A1
GroupedDBID EVB
ID FETCH-epo_espacenet_US2021271968A13
IEDL.DBID EVB
IngestDate Fri Jul 19 13:57:45 EDT 2024
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-epo_espacenet_US2021271968A13
Notes Application Number: US201916967597
OpenAccessLink https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210902&DB=EPODOC&CC=US&NR=2021271968A1
ParticipantIDs epo_espacenet_US2021271968A1
PublicationCentury 2000
PublicationDate 20210902
PublicationDateYYYYMMDD 2021-09-02
PublicationDecade 2020
PublicationYear 2021
RelatedCompanies DeepMind Technologies Limited
SourceID epo
SourceType Open Access Repository
SubjectTerms CALCULATING
COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
HANDLING RECORD CARRIERS
PHYSICS
PRESENTATION OF DATA
RECOGNITION OF DATA
RECORD CARRIERS
Title GENERATIVE NEURAL NETWORK SYSTEMS FOR GENERATING INSTRUCTION SEQUENCES TO CONTROL AN AGENT PERFORMING A TASK
URI https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210902&DB=EPODOC&locale=&CC=US&NR=2021271968A1