Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators

Bibliographic Details
Published in IEEE robotics and automation letters Vol. 7; no. 2; pp. 2377-2384
Main Authors Ma, Yuntao, Farshidian, Farbod, Miki, Takahiro, Lee, Joonho, Hutter, Marco
Format Journal Article
Language English
Published Piscataway IEEE 01.04.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Abstract Deep reinforcement learning produces robust locomotion policies for legged robots over challenging terrains. To date, few studies have leveraged model-based methods to combine these locomotion skills with the precise control of manipulators. Here, we incorporate external dynamics plans into learning-based locomotion policies for mobile manipulation. We train the base policy by applying a random wrench sequence on the robot base in simulation and add the noisified wrench sequence prediction to the policy observations. The policy then learns to counteract the partially-known future disturbance. The random wrench sequences are replaced with the wrench prediction generated with the dynamics plans from model predictive control to enable deployment. We show zero-shot adaptation for manipulators unseen during training. On the hardware, we demonstrate stable locomotion of legged robots with the prediction of the external wrench.
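The training scheme described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: all names, dimensions, and noise parameters below are assumptions. A random wrench sequence is applied to the simulated base, and a noisified window of the upcoming wrenches is appended to the policy observation; per the abstract, at deployment this slot is filled by wrench predictions from the model predictive controller instead.

```python
import numpy as np

rng = np.random.default_rng(0)

HORIZON = 10    # assumed number of future wrench samples shown to the policy
WRENCH_DIM = 6  # force (3) + torque (3) acting on the base

def sample_wrench_sequence(length, max_force=50.0, max_torque=20.0):
    """Random external wrench sequence applied to the base during training
    (magnitudes are illustrative, not from the paper)."""
    forces = rng.uniform(-max_force, max_force, size=(length, 3))
    torques = rng.uniform(-max_torque, max_torque, size=(length, 3))
    return np.hstack([forces, torques])

def noisify(prediction, noise_scale=0.1):
    """Corrupt the future wrench window so the policy only ever sees a
    partially-known disturbance, as the abstract describes."""
    return prediction * (1.0 + rng.normal(0.0, noise_scale, prediction.shape))

# One training step at time t: the true wrench perturbs the simulation,
# while the policy observes a noisy preview of the next HORIZON wrenches.
episode = sample_wrench_sequence(100)
t = 0
true_wrench = episode[t]                              # applied to the base
observation_extra = noisify(episode[t:t + HORIZON])   # appended to policy obs
```

At deployment, `sample_wrench_sequence` would be replaced by the MPC dynamics plan's wrench prediction, which is what lets a policy trained only on random wrenches transfer zero-shot to unseen manipulators.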
Author Lee, Joonho
Miki, Takahiro
Ma, Yuntao
Hutter, Marco
Farshidian, Farbod
Author_xml – sequence: 1
  givenname: Yuntao
  surname: Ma
  fullname: Ma, Yuntao
  email: mayuntao94@gmail.com
  organization: Department of Mechanical and Process Engineering, ETH Zürich, Zurich, Switzerland, 8049 (e-mail: mayuntao94@gmail.com)
– sequence: 2
  givenname: Farbod
  surname: Farshidian
  fullname: Farshidian, Farbod
  email: farshidian@mavt.ethz.ch
  organization: Department of Mechanical and Process Engineering, ETH Zurich, Zürich, Switzerland, 8051 (e-mail: farshidian@mavt.ethz.ch)
– sequence: 3
  givenname: Takahiro
  surname: Miki
  fullname: Miki, Takahiro
  email: tamiki@ethz.ch
  organization: Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland, 8092 (e-mail: tamiki@ethz.ch)
– sequence: 4
  givenname: Joonho
  surname: Lee
  fullname: Lee, Joonho
  email: jolee@ethz.ch
  organization: MAVT, ETH Zurich Robotic Systems Laboratory, Zurich, Switzerland, 8050 (e-mail: jolee@ethz.ch)
– sequence: 5
  givenname: Marco
  surname: Hutter
  fullname: Hutter, Marco
  email: mahutter@ethz.ch
  organization: Institute of Robotics and Intelligent Systems, ETH Zurich, Zurich, Switzerland, 8092 (e-mail: mahutter@ethz.ch)
CODEN IRALC6
CitedBy_id crossref_primary_10_1126_scirobotics_ade9548
crossref_primary_10_1016_j_robot_2023_104468
crossref_primary_10_1109_TASE_2024_3412111
crossref_primary_10_1109_LRA_2024_3519856
crossref_primary_10_1109_TIE_2023_3306405
crossref_primary_10_1126_scirobotics_adg5014
crossref_primary_10_1109_LRA_2023_3301274
crossref_primary_10_1109_LRA_2022_3189166
crossref_primary_10_1017_S0263574723000371
crossref_primary_10_1115_1_4066852
crossref_primary_10_1177_02783649241312698
crossref_primary_10_1007_s10514_023_10146_0
crossref_primary_10_1109_TCSII_2023_3240458
crossref_primary_10_3390_s23115025
crossref_primary_10_1109_LRA_2024_3458592
crossref_primary_10_1109_LRA_2023_3336109
crossref_primary_10_34133_cbsystems_0203
crossref_primary_10_1109_LRA_2023_3335777
crossref_primary_10_3390_s22218146
crossref_primary_10_3390_s24186070
crossref_primary_10_1002_aisy_202300172
Cites_doi 10.1109/ICRA.2017.7989388
10.1109/LRA.2021.3068908
10.1126/scirobotics.aau5872
10.1126/scirobotics.abk2822
10.1126/scirobotics.abc5986
10.1145/3197517.3201311
10.1109/LRA.2020.2979660
10.1109/ICRA.2019.8794273
10.1016/j.ifacol.2017.08.291
10.1007/978-3-031-21090-7_31
10.15607/rss.2018.xiv.010
10.1117/12.2016000
10.1109/HUMANOIDS.2014.7041375
10.1109/IROS.2016.7758092
10.1109/IROS40897.2019.8967733
10.1109/LRA.2018.2792536
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/LRA.2022.3143567
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2377-3766
EndPage 2384
ExternalDocumentID 10_1109_LRA_2022_3143567
9684679
Genre orig-research
GroupedDBID 0R~
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFS
AGQYO
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
IFIPE
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
RIA
RIE
AAYXX
AGSQL
CITATION
EJD
RIG
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
IEDL.DBID RIE
ISSN 2377-3766
IngestDate Sun Jun 29 15:57:00 EDT 2025
Tue Jul 01 03:54:09 EDT 2025
Thu Apr 24 23:11:42 EDT 2025
Wed Aug 27 03:01:14 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-8574-4978
0000-0002-5072-7385
0000-0002-4285-4990
0000-0001-8556-2819
0000-0001-8269-6272
OpenAccessLink http://hdl.handle.net/20.500.11850/532048
PQID 2623469644
PQPubID 4437225
PageCount 8
ParticipantIDs ieee_primary_9684679
proquest_journals_2623469644
crossref_citationtrail_10_1109_LRA_2022_3143567
crossref_primary_10_1109_LRA_2022_3143567
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2022-04-01
PublicationDateYYYYMMDD 2022-04-01
PublicationDate_xml – month: 04
  year: 2022
  text: 2022-04-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref15
ref20
ref11
ref10
ref2
ref1
Yin (ref14) 2020
ref17
ref16
ref19
ref18
ref8
ref7
ref9
ref4
ref3
ref6
Carpentier (ref21)
ref5
Pinto (ref12) 2017
Schulman (ref22) 2017
Finn (ref13) 2017
References_xml – ident: ref2
  doi: 10.1109/ICRA.2017.7989388
– ident: ref6
  doi: 10.1109/LRA.2021.3068908
– start-page: 2015
  ident: ref21
  article-title: Pinocchio: Fast forward and inverse dynamics for poly-articulated systems
– ident: ref7
  doi: 10.1126/scirobotics.aau5872
– volume-title: Proc. Int. Conf. Learn. Representations
  year: 2020
  ident: ref14
  article-title: Meta-learning without memorization
– start-page: 2817
  volume-title: Proc. Int. Conf. Mach. Learn.
  year: 2017
  ident: ref12
  article-title: Robust adversarial reinforcement learning
– year: 2017
  ident: ref22
  article-title: Proximal policy optimization algorithms
– ident: ref10
  doi: 10.1126/scirobotics.abk2822
– ident: ref20
  article-title: OCS2: An open source library for optimal control of switched systems
– ident: ref8
  doi: 10.1126/scirobotics.abc5986
– ident: ref11
  doi: 10.1145/3197517.3201311
– ident: ref16
  doi: 10.1109/LRA.2020.2979660
– ident: ref5
  doi: 10.1109/ICRA.2019.8794273
– start-page: 1126
  volume-title: Proc. Int. Conf. Mach. Learn.
  year: 2017
  ident: ref13
  article-title: Model-agnostic meta-learning for fast adaptation of deep networks
– ident: ref19
  doi: 10.1016/j.ifacol.2017.08.291
– ident: ref15
  doi: 10.1007/978-3-031-21090-7_31
– ident: ref9
  doi: 10.15607/rss.2018.xiv.010
– ident: ref3
  doi: 10.1117/12.2016000
– ident: ref4
  doi: 10.1109/HUMANOIDS.2014.7041375
– ident: ref17
  doi: 10.1109/IROS.2016.7758092
– ident: ref1
  doi: 10.1109/IROS40897.2019.8967733
– ident: ref18
  doi: 10.1109/LRA.2018.2792536
SSID ssj0001527395
Score 2.4802036
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 2377
SubjectTerms Costs
Deep learning
Dynamics
Generators
Legged locomotion
Legged Robots
Locomotion
Manipulators
Mobile Manipulation
Policies
Predictive control
Reinforcement Learning
Robot arms
Robot dynamics
Robots
Training
Trajectory
Title Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators
URI https://ieeexplore.ieee.org/document/9684679
https://www.proquest.com/docview/2623469644
Volume 7
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Combining+Learning-Based+Locomotion+Policy+With+Model-Based+Manipulation+for+Legged+Mobile+Manipulators&rft.jtitle=IEEE+robotics+and+automation+letters&rft.au=Ma%2C+Yuntao&rft.au=Farshidian%2C+Farbod&rft.au=Miki%2C+Takahiro&rft.au=Lee%2C+Joonho&rft.date=2022-04-01&rft.issn=2377-3766&rft.eissn=2377-3766&rft.volume=7&rft.issue=2&rft.spage=2377&rft.epage=2384&rft_id=info:doi/10.1109%2FLRA.2022.3143567&rft.externalDBID=n%2Fa&rft.externalDocID=10_1109_LRA_2022_3143567