Ditto: Building Digital Twins of Articulated Objects from Interaction

Bibliographic Details
Published in	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 5606 - 5616
Main Authors	Jiang, Zhenyu; Hsu, Cheng-Chun; Zhu, Yuke
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2022
Subjects	3D from multi-view and sensors; Physics-based vision and shape-from-X; Representation learning; Buildings; Estimation; Geometry; Solid modeling; Three-dimensional displays; Virtual environments; Visualization

Abstract	Digitizing physical objects into the virtual world has the potential to unlock new research and applications in embodied AI and mixed reality. This work focuses on recreating interactive digital twins of real-world articulated objects, which can be directly imported into virtual environments. We introduce Ditto to learn articulation model estimation and 3D geometry reconstruction of an articulated object through interactive perception. Given a pair of visual observations of an articulated object before and after interaction, Ditto reconstructs part-level geometry and estimates the articulation model of the object. We employ implicit neural representations for joint geometry and articulation modeling. Our experiments show that Ditto effectively builds digital twins of articulated objects in a category-agnostic way. We also apply Ditto to real-world objects and deploy the recreated digital twins in physical simulation. Code and additional results are available at https://ut-austin-rpl.github.io/Ditto/
Author	Zhu, Yuke; Jiang, Zhenyu; Hsu, Cheng-Chun
Author_xml	1. Jiang, Zhenyu (The University of Texas at Austin, Department of Computer Science); 2. Hsu, Cheng-Chun (The University of Texas at Austin, Department of Computer Science); 3. Zhu, Yuke (The University of Texas at Austin, Department of Computer Science)
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IH CBEJK RIE RIO
DOI	10.1109/CVPR52688.2022.00553
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Xplore; IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Applied Sciences
EISBN	1665469463 9781665469463
EISSN	1063-6919
EndPage	5616
ExternalDocumentID	9879272
Genre	orig-research
GrantInformation_xml	– fundername: NSF grantid: CNS-1955523 funderid: 10.13039/100000001
GroupedDBID	6IE 6IH 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP OCL RIE RIL RIO
ID	FETCH-LOGICAL-i203t-345354edfb087651e3bcb48ebcf338ff28f4d5272f19fbd3be819a259bbf34c93
IEDL.DBID	RIE
IngestDate	Wed Aug 27 02:15:10 EDT 2025
IsPeerReviewed	false
IsScholarly	true
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i203t-345354edfb087651e3bcb48ebcf338ff28f4d5272f19fbd3be819a259bbf34c93
PageCount	11
ParticipantIDs	ieee_primary_9879272
PublicationCentury	2000
PublicationDate	2022-June
PublicationDateYYYYMMDD	2022-06-01
PublicationDate_xml	– month: 06 year: 2022 text: 2022-June
PublicationDecade	2020
PublicationTitle	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev	CVPR
PublicationYear	2022
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0003211698
Score	2.5156817
SourceID	ieee
SourceType	Publisher
StartPage	5606
URI	https://ieeexplore.ieee.org/document/9879272
hasFullText	1
inHoldings	1
isFullTextHit
isPrint