Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy

Bibliographic Details
Main Authors Merel, Joshua; Hasenclever, Leonard; Galashov, Alexandre; Pham, Vu
Format Patent
Language English
Published 2023-08-01
Abstract A computer-implemented method of training a student machine learning system comprises receiving data indicating execution of an expert, determining one or more actions performed by the expert during the execution and a corresponding state-action Jacobian, and training the student machine learning system using a linear-feedback-stabilized policy. The linear-feedback-stabilized policy may be based on the state-action Jacobian. Also described is a neural network system, implemented by one or more computers, for representing a space of probabilistic motor primitives. The neural network system comprises an encoder configured to generate latent variables based on a plurality of inputs, each input comprising a plurality of frames, and a decoder configured to generate an action based on a state and one or more of the latent variables.
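
The abstract's central idea, a linear-feedback-stabilized policy built from the expert's actions and a state-action Jacobian, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent text: the toy `expert_policy`, the finite-difference estimate of the Jacobian, and the form a(s) = a_ref + J (s - s_ref) applied around a single reference point. A student system could then be trained on the actions such a policy produces at states near the expert's trajectory.

```python
# Illustrative sketch only: the toy expert, the finite-difference Jacobian and
# all names below are assumptions, not the patent's implementation.

def expert_policy(state):
    """Hypothetical toy expert: maps a 2-D state to a 1-D action."""
    x, v = state
    return [-1.5 * x - 0.5 * v + 0.1 * x * x]

def state_action_jacobian(policy, state, eps=1e-5):
    """Estimate J[i][j] = d(action_i)/d(state_j) by central finite differences."""
    n_actions = len(policy(state))
    jac = [[0.0] * len(state) for _ in range(n_actions)]
    for j in range(len(state)):
        plus = list(state)
        plus[j] += eps
        minus = list(state)
        minus[j] -= eps
        a_plus, a_minus = policy(plus), policy(minus)
        for i in range(n_actions):
            jac[i][j] = (a_plus[i] - a_minus[i]) / (2 * eps)
    return jac

def linear_feedback_policy(ref_state, ref_action, jac):
    """Build a(s) = a_ref + J (s - s_ref) around one reference point."""
    def policy(state):
        delta = [s - r for s, r in zip(state, ref_state)]
        return [a + sum(row[j] * delta[j] for j in range(len(delta)))
                for a, row in zip(ref_action, jac)]
    return policy

# One point on a (hypothetical) expert trajectory and the expert's action there.
ref_state = [0.4, -0.2]
ref_action = expert_policy(ref_state)
jac = state_action_jacobian(expert_policy, ref_state)
stabilized = linear_feedback_policy(ref_state, ref_action, jac)

# Near the reference state the linear policy closely tracks the expert.
perturbed = [0.45, -0.25]
print(stabilized(perturbed), expert_policy(perturbed))
```

The linear policy reproduces the expert's action exactly at the reference state and approximates it for small deviations, which is why it only needs the recorded action and the Jacobian rather than further queries to the expert.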
Patent Number US11714996B2
Application Number US202217872308
OpenAccessLink https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230801&DB=EPODOC&CC=US&NR=11714996B2
Related Companies DeepMind Technologies Limited
Subject Terms CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS