Ditto: Building Digital Twins of Articulated Objects from Interaction

Bibliographic Details
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 5606 - 5616
Main Authors Jiang, Zhenyu; Hsu, Cheng-Chun; Zhu, Yuke
Format Conference Proceeding
Language English
Published IEEE 01.06.2022
Abstract Digitizing physical objects into the virtual world has the potential to unlock new research and applications in embodied AI and mixed reality. This work focuses on recreating interactive digital twins of real-world articulated objects, which can be directly imported into virtual environments. We introduce Ditto to learn articulation model estimation and 3D geometry reconstruction of an articulated object through interactive perception. Given a pair of visual observations of an articulated object before and after interaction, Ditto reconstructs part-level geometry and estimates the articulation model of the object. We employ implicit neural representations for joint geometry and articulation modeling. Our experiments show that Ditto effectively builds digital twins of articulated objects in a category-agnostic way. We also apply Ditto to real-world objects and deploy the recreated digital twins in physical simulation. Code and additional results are available at https://ut-austin-rpl.github.io/Ditto/
Author Zhu, Yuke
Jiang, Zhenyu
Hsu, Cheng-Chun
Author_xml – sequence: 1
  givenname: Zhenyu
  surname: Jiang
  fullname: Jiang, Zhenyu
  organization: The University of Texas at Austin, Department of Computer Science
– sequence: 2
  givenname: Cheng-Chun
  surname: Hsu
  fullname: Hsu, Cheng-Chun
  organization: The University of Texas at Austin, Department of Computer Science
– sequence: 3
  givenname: Yuke
  surname: Zhu
  fullname: Zhu, Yuke
  organization: The University of Texas at Austin, Department of Computer Science
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR52688.2022.00553
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 1665469463
9781665469463
EISSN 1063-6919
EndPage 5616
ExternalDocumentID 9879272
Genre orig-research
GrantInformation_xml – fundername: NSF
  grantid: CNS-1955523
  funderid: 10.13039/100000001
GroupedDBID 6IE
6IH
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IJVOP
OCL
RIE
RIL
RIO
ID FETCH-LOGICAL-i203t-345354edfb087651e3bcb48ebcf338ff28f4d5272f19fbd3be819a259bbf34c93
IEDL.DBID RIE
IngestDate Wed Aug 27 02:15:10 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i203t-345354edfb087651e3bcb48ebcf338ff28f4d5272f19fbd3be819a259bbf34c93
PageCount 11
ParticipantIDs ieee_primary_9879272
PublicationCentury 2000
PublicationDate 2022-June
PublicationDateYYYYMMDD 2022-06-01
PublicationDate_xml – month: 06
  year: 2022
  text: 2022-June
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211698
Score 2.5156817
SourceID ieee
SourceType Publisher
StartPage 5606
SubjectTerms 3D from multi-view and sensors; Physics-based vision and shape-from-X; Representation learning
Buildings
Estimation
Geometry
Solid modeling
Three-dimensional displays
Virtual environments
Visualization
Title Ditto: Building Digital Twins of Articulated Objects from Interaction
URI https://ieeexplore.ieee.org/document/9879272
hasFullText 1
inHoldings 1
isFullTextHit
isPrint