MORPH: Design Co-optimization with Reinforcement Learning via a Differentiable Hardware Model Proxy

Bibliographic Details
Published in 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 7764-7771
Main Authors He, Zhanpeng; Ciocarlie, Matei
Format Conference Proceeding
Language English
Published IEEE 13.05.2024
Abstract We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning. Like most co-optimization methods, MORPH relies on a model of the hardware being optimized, usually simulated based on the laws of physics. However, such a model is often difficult to integrate into an effective optimization routine. To address this, we introduce a proxy hardware model, which is always differentiable and enables efficient co-optimization alongside a long-horizon control policy using RL. MORPH is designed to ensure that the optimized hardware proxy remains as close as possible to its realistic counterpart, while still enabling task completion. We demonstrate our approach on simulated 2D reaching and 3D multi-fingered manipulation tasks.
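The abstract's core idea (optimize a differentiable proxy of the hardware together with a controller, while keeping the proxy close to what realistic hardware can do) can be illustrated with a toy sketch. This is not the authors' algorithm: plain gradient descent stands in for reinforcement learning, a single scalar stands in for the proxy model, and every name here (`real_hardware`, `proxy`, `co_optimize`, the 1D reaching setup, the candidate grid) is a hypothetical chosen for illustration.

```python
import numpy as np

def real_hardware(d, u):
    """Hypothetical black-box hardware model: tip position of a link of
    length d under command u. Rounding makes it non-differentiable."""
    return np.round(d * np.tanh(u), 2)

def proxy(p, u):
    """Differentiable stand-in for the hardware, with its own parameter p."""
    return p * np.tanh(u)

def co_optimize(target=0.8, steps=500, lr=0.05, lam=1.0):
    # p: proxy parameter, u: open-loop "policy" command, d: real design parameter
    p, u, d = 0.1, 0.1, 0.5
    candidates = np.linspace(0.2, 1.5, 131)  # feasible real designs (assumed)
    for _ in range(steps):
        t = np.tanh(u)
        y = p * t
        y_real = real_hardware(d, u)  # treated as a constant in the gradient
        # loss = (y - target)^2 + lam * (y - y_real)^2
        dloss_dy = 2.0 * (y - target) + 2.0 * lam * (y - y_real)
        grad_p = dloss_dy * t
        grad_u = dloss_dy * p * (1.0 - t ** 2)
        # Gradient step on proxy and policy together (assumption: plain
        # gradient descent in place of an RL update).
        p -= lr * grad_p
        u -= lr * grad_u
        # Derivative-free design update: pick the feasible real design whose
        # behavior best matches the current proxy.
        errors = np.abs(real_hardware(candidates, u) - proxy(p, u))
        d = candidates[np.argmin(errors)]
    return p, u, d

p, u, d = co_optimize()
print("proxy output:", proxy(p, u), "real output:", real_hardware(d, u), "design:", d)
```

The second loss term is what ties the proxy to the realistic model in this sketch; setting `lam` to zero would let the proxy solve the task without regard to any design that `real_hardware` can actually realize.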
Author He, Zhanpeng
Ciocarlie, Matei
Author_xml – sequence: 1
  givenname: Zhanpeng
  surname: He
  fullname: He, Zhanpeng
  email: zhanpeng@cs.columbia.edu
  organization: Columbia University, Department of Computer Science, New York, USA
– sequence: 2
  givenname: Matei
  surname: Ciocarlie
  fullname: Ciocarlie, Matei
  email: matei.ciocarlie@columbia.edu
  organization: Columbia University, Department of Mechanical Engineering, New York, USA
ContentType Conference Proceeding
DOI 10.1109/ICRA57147.2024.10610732
Discipline Physics
EISBN 9798350384574
EndPage 7771
ExternalDocumentID 10610732
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
ParticipantIDs ieee_primary_10610732
PublicationCentury 2000
PublicationDate 2024-May-13
PublicationDateYYYYMMDD 2024-05-13
PublicationDate_xml – month: 05
  year: 2024
  text: 2024-May-13
  day: 13
PublicationDecade 2020
PublicationTitle 2024 IEEE International Conference on Robotics and Automation (ICRA)
PublicationTitleAbbrev ICRA
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 7764
SubjectTerms Hardware
Optimization
Physics
Reinforcement learning
Robotics and automation
Task analysis
Three-dimensional displays
Title MORPH: Design Co-optimization with Reinforcement Learning via a Differentiable Hardware Model Proxy
URI https://ieeexplore.ieee.org/document/10610732