Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators

Bibliographic Details
Published in IEEE robotics and automation letters Vol. 7; no. 2; pp. 2377-2384
Main Authors Ma, Yuntao, Farshidian, Farbod, Miki, Takahiro, Lee, Joonho, Hutter, Marco
Format Journal Article
Language English
Published Piscataway IEEE 01.04.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Abstract Deep reinforcement learning produces robust locomotion policies for legged robots over challenging terrains. To date, few studies have leveraged model-based methods to combine these locomotion skills with the precise control of manipulators. Here, we incorporate external dynamics plans into learning-based locomotion policies for mobile manipulation. We train the base policy by applying a random wrench sequence on the robot base in simulation and add the noisified wrench sequence prediction to the policy observations. The policy then learns to counteract the partially-known future disturbance. The random wrench sequences are replaced with the wrench prediction generated with the dynamics plans from model predictive control to enable deployment. We show zero-shot adaptation for manipulators unseen during training. On the hardware, we demonstrate stable locomotion of legged robots with the prediction of the external wrench.
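The training scheme described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: all names, dimensions, and noise parameters below are assumptions. A random wrench sequence is applied to the simulated base, and a noisified window of the upcoming wrenches is appended to the policy observation; per the abstract, at deployment this slot is filled by wrench predictions from the model predictive controller instead.

```python
import numpy as np

rng = np.random.default_rng(0)

HORIZON = 10    # assumed number of future wrench samples shown to the policy
WRENCH_DIM = 6  # force (3) + torque (3) acting on the base

def sample_wrench_sequence(length, max_force=50.0, max_torque=20.0):
    """Random external wrench sequence applied to the base during training
    (magnitudes are illustrative, not from the paper)."""
    forces = rng.uniform(-max_force, max_force, size=(length, 3))
    torques = rng.uniform(-max_torque, max_torque, size=(length, 3))
    return np.hstack([forces, torques])

def noisify(prediction, noise_scale=0.1):
    """Corrupt the future wrench window so the policy only ever sees a
    partially-known disturbance, as the abstract describes."""
    return prediction * (1.0 + rng.normal(0.0, noise_scale, prediction.shape))

# One training step at time t: the true wrench perturbs the simulation,
# while the policy observes a noisy preview of the next HORIZON wrenches.
episode = sample_wrench_sequence(100)
t = 0
true_wrench = episode[t]                              # applied to the base
observation_extra = noisify(episode[t:t + HORIZON])   # appended to policy obs
```

At deployment, `sample_wrench_sequence` would be replaced by the MPC dynamics plan's wrench prediction, which is what lets a policy trained only on random wrenches transfer zero-shot to unseen manipulators.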
Author Lee, Joonho
Miki, Takahiro
Ma, Yuntao
Hutter, Marco
Farshidian, Farbod
Author_xml – sequence: 1
  givenname: Yuntao
  surname: Ma
  fullname: Ma, Yuntao
  email: mayuntao94@gmail.com
  organization: Department of Mechanical and Process Engineering, ETH Zürich, Zurich, Switzerland, 8049 (e-mail: mayuntao94@gmail.com)
– sequence: 2
  givenname: Farbod
  surname: Farshidian
  fullname: Farshidian, Farbod
  email: farshidian@mavt.ethz.ch
  organization: Department of Mechanical and Process Engineering, ETH Zurich, Zürich, Switzerland, 8051 (e-mail: farshidian@mavt.ethz.ch)
– sequence: 3
  givenname: Takahiro
  surname: Miki
  fullname: Miki, Takahiro
  email: tamiki@ethz.ch
  organization: Department of Mechanical and Process Engineering, ETH Zurich, Zurich, Switzerland, 8092 (e-mail: tamiki@ethz.ch)
– sequence: 4
  givenname: Joonho
  surname: Lee
  fullname: Lee, Joonho
  email: jolee@ethz.ch
  organization: MAVT, ETH Zurich Robotic Systems Laboratory, Zurich, Switzerland, 8050 (e-mail: jolee@ethz.ch)
– sequence: 5
  givenname: Marco
  surname: Hutter
  fullname: Hutter, Marco
  email: mahutter@ethz.ch
  organization: Institute of Robotics and Intelligent Systems, ETH Zurich, Zurich, Switzerland, 8092 (e-mail: mahutter@ethz.ch)
CODEN IRALC6
CitedBy_id crossref_primary_10_1126_scirobotics_ade9548
crossref_primary_10_1016_j_robot_2023_104468
crossref_primary_10_1109_TASE_2024_3412111
crossref_primary_10_1109_LRA_2024_3519856
crossref_primary_10_1109_TIE_2023_3306405
crossref_primary_10_1126_scirobotics_adg5014
crossref_primary_10_1109_LRA_2023_3301274
crossref_primary_10_1109_LRA_2022_3189166
crossref_primary_10_1017_S0263574723000371
crossref_primary_10_1115_1_4066852
crossref_primary_10_1177_02783649241312698
crossref_primary_10_1007_s10514_023_10146_0
crossref_primary_10_1109_TCSII_2023_3240458
crossref_primary_10_3390_s23115025
crossref_primary_10_1109_LRA_2024_3458592
crossref_primary_10_1109_LRA_2023_3336109
crossref_primary_10_34133_cbsystems_0203
crossref_primary_10_1109_LRA_2023_3335777
crossref_primary_10_3390_s22218146
crossref_primary_10_3390_s24186070
crossref_primary_10_1002_aisy_202300172
Cites_doi 10.1109/ICRA.2017.7989388
10.1109/LRA.2021.3068908
10.1126/scirobotics.aau5872
10.1126/scirobotics.abk2822
10.1126/scirobotics.abc5986
10.1145/3197517.3201311
10.1109/LRA.2020.2979660
10.1109/ICRA.2019.8794273
10.1016/j.ifacol.2017.08.291
10.1007/978-3-031-21090-7_31
10.15607/rss.2018.xiv.010
10.1117/12.2016000
10.1109/HUMANOIDS.2014.7041375
10.1109/IROS.2016.7758092
10.1109/IROS40897.2019.8967733
10.1109/LRA.2018.2792536
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/LRA.2022.3143567
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2377-3766
EndPage 2384
ExternalDocumentID 10_1109_LRA_2022_3143567
9684679
Genre orig-research
GroupedDBID 0R~
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFS
AGQYO
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
IFIPE
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
RIA
RIE
AAYXX
AGSQL
CITATION
EJD
RIG
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
IEDL.DBID RIE
ISSN 2377-3766
IngestDate Sun Jun 29 15:57:00 EDT 2025
Tue Jul 01 03:54:09 EDT 2025
Thu Apr 24 23:11:42 EDT 2025
Wed Aug 27 03:01:14 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-8574-4978
0000-0002-5072-7385
0000-0002-4285-4990
0000-0001-8556-2819
0000-0001-8269-6272
OpenAccessLink http://hdl.handle.net/20.500.11850/532048
PQID 2623469644
PQPubID 4437225
PageCount 8
ParticipantIDs ieee_primary_9684679
proquest_journals_2623469644
crossref_citationtrail_10_1109_LRA_2022_3143567
crossref_primary_10_1109_LRA_2022_3143567
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2022-04-01
PublicationDateYYYYMMDD 2022-04-01
PublicationDate_xml – month: 04
  year: 2022
  text: 2022-04-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref15
ref20
ref11
ref10
ref2
ref1
Yin (ref14) 2020
ref17
ref16
ref19
ref18
ref8
ref7
ref9
ref4
ref3
ref6
Carpentier (ref21)
ref5
Pinto (ref12) 2017
Schulman (ref22) 2017
Finn (ref13) 2017
References_xml – ident: ref2
  doi: 10.1109/ICRA.2017.7989388
– ident: ref6
  doi: 10.1109/LRA.2021.3068908
– start-page: 2015
  ident: ref21
  article-title: Pinocchio: Fast forward and inverse dynamics for poly-articulated systems
– ident: ref7
  doi: 10.1126/scirobotics.aau5872
– volume-title: Proc. Int. Conf. Learn. Representations
  year: 2020
  ident: ref14
  article-title: Meta-learning without memorization
– start-page: 2817
  volume-title: Proc. Int. Conf. Mach. Learn.
  year: 2017
  ident: ref12
  article-title: Robust adversarial reinforcement learning
– year: 2017
  ident: ref22
  article-title: Proximal policy optimization algorithms
– ident: ref10
  doi: 10.1126/scirobotics.abk2822
– ident: ref20
  article-title: OCS2: An open source library for optimal control of switched systems
– ident: ref8
  doi: 10.1126/scirobotics.abc5986
– ident: ref11
  doi: 10.1145/3197517.3201311
– ident: ref16
  doi: 10.1109/LRA.2020.2979660
– ident: ref5
  doi: 10.1109/ICRA.2019.8794273
– start-page: 1126
  volume-title: Proc. Int. Conf. Mach. Learn.
  year: 2017
  ident: ref13
  article-title: Model-agnostic meta-learning for fast adaptation of deep networks
– ident: ref19
  doi: 10.1016/j.ifacol.2017.08.291
– ident: ref15
  doi: 10.1007/978-3-031-21090-7_31
– ident: ref9
  doi: 10.15607/rss.2018.xiv.010
– ident: ref3
  doi: 10.1117/12.2016000
– ident: ref4
  doi: 10.1109/HUMANOIDS.2014.7041375
– ident: ref17
  doi: 10.1109/IROS.2016.7758092
– ident: ref1
  doi: 10.1109/IROS40897.2019.8967733
– ident: ref18
  doi: 10.1109/LRA.2018.2792536
SSID ssj0001527395
Score 2.4802036
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 2377
SubjectTerms Costs
Deep learning
Dynamics
Generators
Legged locomotion
Legged Robots
Locomotion
Manipulators
Mobile Manipulation
Policies
Predictive control
Reinforcement Learning
Robot arms
Robot dynamics
Robots
Training
Trajectory
Title Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators
URI https://ieeexplore.ieee.org/document/9684679
https://www.proquest.com/docview/2623469644
Volume 7
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Combining+Learning-Based+Locomotion+Policy+With+Model-Based+Manipulation+for+Legged+Mobile+Manipulators&rft.jtitle=IEEE+robotics+and+automation+letters&rft.au=Ma%2C+Yuntao&rft.au=Farshidian%2C+Farbod&rft.au=Miki%2C+Takahiro&rft.au=Lee%2C+Joonho&rft.date=2022-04-01&rft.issn=2377-3766&rft.eissn=2377-3766&rft.volume=7&rft.issue=2&rft.spage=2377&rft.epage=2384&rft_id=info:doi/10.1109%2FLRA.2022.3143567&rft.externalDBID=n%2Fa&rft.externalDocID=10_1109_LRA_2022_3143567