Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators
Published in | IEEE Robotics and Automation Letters, Vol. 7, No. 2, pp. 2377–2384
Main Authors | Ma, Yuntao; Farshidian, Farbod; Miki, Takahiro; Lee, Joonho; Hutter, Marco
Format | Journal Article
Language | English
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.04.2022
Online Access | Get full text
Abstract | Deep reinforcement learning produces robust locomotion policies for legged robots over challenging terrains. To date, few studies have leveraged model-based methods to combine these locomotion skills with the precise control of manipulators. Here, we incorporate external dynamics plans into learning-based locomotion policies for mobile manipulation. We train the base policy by applying a random wrench sequence on the robot base in simulation and add the noisified wrench sequence prediction to the policy observations. The policy then learns to counteract the partially-known future disturbance. The random wrench sequences are replaced with the wrench prediction generated with the dynamics plans from model predictive control to enable deployment. We show zero-shot adaptation for manipulators unseen during training. On the hardware, we demonstrate stable locomotion of legged robots with the prediction of the external wrench. |
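The training scheme the abstract describes — sampling a random wrench sequence applied to the base in simulation, and exposing only a noise-corrupted prediction of that sequence to the policy — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the horizon length, wrench magnitudes, noise scale, and observation layout are assumed values chosen for the example.

```python
import numpy as np

def sample_wrench_sequence(horizon=10, wrench_dim=6, scale=20.0, rng=None):
    """Sample a random sequence of 6-D wrenches (force + torque) to apply
    to the robot base during a simulated training episode."""
    if rng is None:
        rng = np.random.default_rng()
    return rng.uniform(-scale, scale, size=(horizon, wrench_dim))

def noisify_prediction(wrench_seq, noise_std=2.0, rng=None):
    """Corrupt the true future wrench sequence with Gaussian noise, so the
    policy only ever sees a partially-known disturbance prediction."""
    if rng is None:
        rng = np.random.default_rng()
    return wrench_seq + rng.normal(0.0, noise_std, size=wrench_seq.shape)

def build_observation(proprio, wrench_prediction):
    """Append the flattened (noisy) wrench prediction to the ordinary
    proprioceptive observation vector fed to the locomotion policy."""
    return np.concatenate([proprio, wrench_prediction.ravel()])
```

At deployment time, per the abstract, the randomly sampled sequence is replaced by the wrench forecast computed from the model-predictive-control dynamics plan, while the observation layout stays the same — which is what allows zero-shot transfer to manipulators unseen during training.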
Author | Ma, Yuntao; Farshidian, Farbod; Miki, Takahiro; Lee, Joonho; Hutter, Marco (Robotic Systems Lab, ETH Zurich, Zurich, Switzerland)
CODEN | IRALC6 |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
DOI | 10.1109/LRA.2022.3143567 |
Discipline | Engineering |
EISSN | 2377-3766 |
ISSN | 2377-3766 |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0002-8574-4978 0000-0002-5072-7385 0000-0002-4285-4990 0000-0001-8556-2819 0000-0001-8269-6272 |
OpenAccessLink | http://hdl.handle.net/20.500.11850/532048 |
SubjectTerms | Costs; Deep learning; Dynamics; Generators; Legged locomotion; Legged robots; Locomotion; Manipulators; Mobile manipulation; Policies; Predictive control; Reinforcement learning; Robot arms; Robot dynamics; Robots; Training; Trajectory
URI | https://ieeexplore.ieee.org/document/9684679 https://www.proquest.com/docview/2623469644 |