Category‐Specific Salient View Selection via Deep Convolutional Neural Networks

Bibliographic Details
Published in	Computer Graphics Forum, Vol. 36, no. 8, pp. 313-328
Main Authors	Kim, Seong‐heum; Tai, Yu‐Wing; Lee, Joon‐Young; Park, Jaesik; Kweon, In So
Format	Journal Article
Language	English
Published	Oxford: Blackwell Publishing Ltd, 01.12.2017
Subjects	Artificial neural networks; best view selection; Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Picture/Image Generation—Display algorithms; deep learning; Neural networks; Three dimensional models; Thumbnail icons; Two dimensional models; upright orientation estimation; Viewing algorithms; I.5.1 [Pattern Recognition]: Models—Neural Nets
Summary:	In this paper, we present a new framework to determine up front orientations and detect salient views of 3D models. The salient viewpoint to human preferences is the most informative projection with correct upright orientation. Our method utilizes two Convolutional Neural Network (CNN) architectures to encode category‐specific information learnt from a large number of 3D shapes and 2D images on the web. Using the first CNN model with 3D voxel data, we generate a CNN shape feature to decide natural upright orientation of 3D objects. Once a 3D model is upright‐aligned, the front projection and salient views are scored by category recognition using the second CNN model. The second CNN is trained over popular photo collections from internet users. In order to model comfortable viewing angles of 3D models, a category‐dependent prior is also learnt from the users. Our approach effectively combines category‐specific scores and classical evaluations to produce a data‐driven viewpoint saliency map. The best viewpoints from the method are quantitatively and qualitatively validated with more than 100 objects from 20 categories. Our thumbnail images of 3D models are the most favoured among those from different approaches.
ISSN:	0167-7055, 1467-8659
DOI:	10.1111/cgf.13082