3D ShapeNets: A deep representation for volumetric shapes

Bibliographic Details
Published in	2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1912-1920
Main Authors	Zhirong Wu; Shuran Song; Aditya Khosla; Fisher Yu; Linguang Zhang; Xiaoou Tang; Jianxiong Xiao
Format	Conference Proceeding; Journal Article
Language	English
Published	IEEE, 2015-06-01
Subjects	Categories; Computational modeling; Computer vision; Convolution; Object recognition; Pattern recognition; Planning; Representations; Shape; Solid modeling; Three dimensional; Three dimensional models; Three-dimensional displays
ISSN	1063-6919; 2575-7075
DOI	10.1109/CVPR.2015.7298801

Abstract	3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g., Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet, a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
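
The abstract describes a shape as binary variables on a 3D voxel grid. The sketch below illustrates only that input representation, by mapping a set of 3D points to a binary occupancy grid. It does not implement the Convolutional Deep Belief Network or the learned probability distribution, and it is not the authors' preprocessing. The function names `voxelize` and `occupancy` and the `resolution` parameter are hypothetical, chosen for illustration.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]
Grid = List[List[List[int]]]


def voxelize(points: Iterable[Point], resolution: int = 8) -> Grid:
    """Map 3D points to a binary occupancy grid of size resolution^3.

    Hypothetical helper for illustration; not the paper's preprocessing.
    """
    pts = list(points)
    if not pts:
        raise ValueError("need at least one point")
    # Axis-aligned bounding box of the input points.
    mins = [min(p[a] for p in pts) for a in range(3)]
    maxs = [max(p[a] for p in pts) for a in range(3)]
    # A single uniform scale keeps the aspect ratio of the shape.
    # Fall back to 1.0 when all points coincide (zero extent).
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0
    grid = [[[0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for p in pts:
        idx = []
        for a in range(3):
            i = int((p[a] - mins[a]) / extent * resolution)
            idx.append(min(i, resolution - 1))  # clamp points on the max face
        grid[idx[0]][idx[1]][idx[2]] = 1  # each voxel is one binary variable
    return grid


def occupancy(grid: Grid) -> int:
    """Count the occupied voxels in the grid."""
    return sum(v for plane in grid for row in plane for v in row)
```

Under these assumptions, `voxelize([(0, 0, 0), (1, 1, 1)], resolution=4)` marks the two opposite corner voxels of a 4x4x4 grid. A model such as the one in the abstract would then treat every voxel as a binary random variable rather than a fixed value.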
Author Affiliations	Zhirong Wu (Princeton University, USA); Shuran Song (Princeton University, USA); Aditya Khosla (Massachusetts Institute of Technology, USA); Fisher Yu (Princeton University, USA); Linguang Zhang (Princeton University, USA); Xiaoou Tang (Chinese University of Hong Kong, China); Jianxiong Xiao (Princeton University, USA)
Discipline	Applied Sciences; Computer Science
EISBN	1467369640 9781467369640
URI	https://ieeexplore.ieee.org/document/7298801 https://www.proquest.com/docview/1770324000