Are Cars Just 3D Boxes? Jointly Estimating the 3D Shape of Multiple Objects

Current systems for scene understanding typically represent objects as 2D or 3D bounding boxes. While these representations have proven robust in a variety of applications, they provide only coarse approximations to the true 2D and 3D extent of objects. As a result, object-object interactions, such...

Full description

Saved in:
Bibliographic Details
Published in2014 IEEE Conference on Computer Vision and Pattern Recognition pp. 3678 - 3685
Main Authors Zia, M. Zeeshan, Stark, Michael, Schindler, Konrad
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.06.2014
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Current systems for scene understanding typically represent objects as 2D or 3D bounding boxes. While these representations have proven robust in a variety of applications, they provide only coarse approximations to the true 2D and 3D extent of objects. As a result, object-object interactions, such as occlusions or ground-plane contact, can be represented only superficially. In this paper, we approach the problem of scene understanding from the perspective of 3D shape modeling, and design a 3D scene representation that reasons jointly about the 3D shape of multiple objects. This representation allows to express 3D geometry and occlusion on the fine detail level of individual vertices of 3D wireframe models, and makes it possible to treat dependencies between objects, such as occlusion reasoning, in a deterministic way. In our experiments, we demonstrate the benefit of jointly estimating the 3D shape of multiple objects in a scene over working with coarse boxes, on the recently proposed KITTI dataset of realistic street scenes.
ISSN:1063-6919
DOI:10.1109/CVPR.2014.470