3D ShapeNets: A deep representation for volumetric shapes

Bibliographic Details
Published in: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1912-1920
Main Authors: Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, Jianxiong Xiao
Format: Conference Proceeding, Journal Article
Language: English
Published: IEEE, 2015-06-01
ISSN: 1063-6919, 2575-7075
DOI: 10.1109/CVPR.2015.7298801

Abstract: 3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g., Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet, a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
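The voxel-grid representation mentioned in the abstract can be illustrated with a minimal sketch. It shows only how a set of 3D surface points might be turned into a grid of binary occupancy variables. It does not implement the Convolutional Deep Belief Network or any probability model. The names `voxelize` and `occupied_count`, the assumption that points lie in the unit cube, and the grid resolution of 4 are all hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]
Grid = List[List[List[int]]]


def voxelize(points: Iterable[Point], resolution: int) -> Grid:
    """Map points assumed to lie in the unit cube to a binary occupancy grid.

    Each cell holds 1 if at least one point falls inside it, else 0.
    """
    grid: Grid = [
        [[0] * resolution for _ in range(resolution)]
        for _ in range(resolution)
    ]
    for point in points:
        # Scale each coordinate to a cell index and clamp it into range,
        # so points on or slightly outside the cube boundary stay valid.
        i, j, k = (
            max(0, min(int(c * resolution), resolution - 1)) for c in point
        )
        grid[i][j][k] = 1
    return grid


def occupied_count(grid: Grid) -> int:
    """Count the occupied cells in the grid."""
    return sum(cell for plane in grid for row in plane for cell in row)


# Illustrative usage: three sample points, two of which share a cell.
sample_points = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.95, 0.9, 0.9)]
sample_grid = voxelize(sample_points, resolution=4)
print(occupied_count(sample_grid))
```

A generative model of the kind the abstract describes would treat each cell of such a grid as a binary random variable and learn a joint distribution over all cells; the sketch stops at producing the grid itself.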
Author affiliations:
1. Zhirong Wu, Princeton University, USA
2. Shuran Song, Princeton University, USA
3. Aditya Khosla, Massachusetts Institute of Technology, USA
4. Fisher Yu, Princeton University, USA
5. Linguang Zhang, Princeton University, USA
6. Xiaoou Tang, Chinese University of Hong Kong, China
7. Jianxiong Xiao, Princeton University, USA
EISBN: 1467369640, 9781467369640
Page count: 9
Subject terms: Categories; Computational modeling; Computer vision; Convolution; Object recognition; Pattern recognition; Planning; Representations; Shape; Solid modeling; Three dimensional; Three dimensional models; Three-dimensional displays