MORPH: Design Co-optimization with Reinforcement Learning via a Differentiable Hardware Model Proxy

Bibliographic Details
Published in	2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 7764-7771
Main Authors	He, Zhanpeng; Ciocarlie, Matei
Format	Conference Proceeding
Language	English
Published	IEEE 13.05.2024
Subjects	Hardware; Optimization; Physics; Reinforcement learning; Robotics and automation; Task analysis; Three-dimensional displays
Abstract	We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning. Like most co-optimization methods, MORPH relies on a model of the hardware being optimized, usually simulated based on the laws of physics. However, such a model is often difficult to integrate into an effective optimization routine. To address this, we introduce a proxy hardware model, which is always differentiable and enables efficient co-optimization alongside a long-horizon control policy using RL. MORPH is designed to ensure that the optimized hardware proxy remains as close as possible to its realistic counterpart, while still enabling task completion. We demonstrate our approach on simulated 2D reaching and 3D multi-fingered manipulation tasks.
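The co-optimization idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it replaces reinforcement learning with plain gradient descent on a one-step reaching task, and every name in it (`real_hardware`, `co_optimize`, the link length `d`, the proxy weight `w`, the penalty weight `lam`) is a hypothetical stand-in chosen for illustration. It only shows the general pattern of optimizing a policy together with a differentiable proxy, penalizing the proxy's mismatch with the real hardware model, and then moving the real design toward the proxy without differentiating the simulator.

```python
import math

def real_hardware(d, a):
    """Stand-in for a physics simulator: reach of a link of length d under action a.
    Treated as a black box here: it is called, never differentiated."""
    return d * a

def co_optimize(target=1.5, d=1.0, lam=1.0, lr=0.05, step=0.01, iters=1000):
    theta = 0.5  # policy parameter; the action a = tanh(theta) lies in (-1, 1)
    w = d        # differentiable proxy parameter, initialised at the real design
    for _ in range(iters):
        a = math.tanh(theta)
        y_proxy = w * a
        y_real = real_hardware(d, a)  # assumption: held constant in the gradient
        # loss = (y_proxy - target)^2 + lam * (y_proxy - y_real)^2
        # g is half the derivative of that loss with respect to y_proxy
        g = (y_proxy - target) + lam * (y_proxy - y_real)
        grad_w = 2.0 * g * a
        grad_theta = 2.0 * g * w * (1.0 - a * a)
        w -= lr * grad_w
        theta -= lr * grad_theta
        # Derivative-free design step: pick the nearby design (clipped to
        # [0.5, 3.0]) whose real output best matches the proxy output.
        a = math.tanh(theta)
        candidates = [min(max(c, 0.5), 3.0) for c in (d - step, d, d + step)]
        d = min(candidates, key=lambda c: (real_hardware(c, a) - w * a) ** 2)
    return theta, w, d

theta, w, d = co_optimize()
reach_error = real_hardware(d, math.tanh(theta)) - 1.5
print(f"design d={d:.2f}, proxy w={w:.2f}, real reach error={reach_error:+.3f}")
```

In this toy setup the target (1.5) is out of reach for the initial design (`d = 1.0`) because the action is bounded, so the task can only be completed if the design itself grows. The mismatch penalty keeps the proxy from drifting far from what the real design can do, which loosely mirrors the abstract's requirement that the proxy stay close to its realistic counterpart.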
Author	He, Zhanpeng; Ciocarlie, Matei
Author affiliations	1. He, Zhanpeng (zhanpeng@cs.columbia.edu): Columbia University, Department of Computer Science, New York, USA; 2. Ciocarlie, Matei (matei.ciocarlie@columbia.edu): Columbia University, Department of Mechanical Engineering, New York, USA
DOI	10.1109/ICRA57147.2024.10610732
Discipline	Physics
EISBN	9798350384574
EndPage	7771
ExternalDocumentID	10610732
Genre	orig-research
IsPeerReviewed	false
IsScholarly	false
PublicationCentury	2000
PublicationDate	2024-May-13
PublicationDateYYYYMMDD	2024-05-13
PublicationDate_xml	– month: 05 year: 2024 text: 2024-May-13 day: 13
PublicationDecade	2020
PublicationTitle	2024 IEEE International Conference on Robotics and Automation (ICRA)
PublicationTitleAbbrev	ICRA
PublicationYear	2024
Publisher	IEEE
Publisher_xml	– name: IEEE
StartPage	7764
Title	MORPH: Design Co-optimization with Reinforcement Learning via a Differentiable Hardware Model Proxy
URI	https://ieeexplore.ieee.org/document/10610732