Deep MANTA: A Coarse-to-Fine Many-Task Network for Joint 2D and 3D Vehicle Analysis from Monocular Image

Bibliographic Details
Published in	2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1827-1836
Main Authors	Chabot, Florian; Chaouch, Mohamed; Rabarisoa, Jaonary; Teuliere, Celine; Chateau, Thierry
Format	Conference Proceeding
Language	English
Published	IEEE 01.07.2017
Subjects	Object detection; Pose estimation; Proposals; Shape; Solid modeling; Three-dimensional displays; Two dimensional displays
Summary:	In this paper, we present a novel approach, called Deep MANTA (Deep Many-Tasks), for many-task vehicle analysis from a given image. A robust convolutional network is introduced for simultaneous vehicle detection, part localization, visibility characterization and 3D dimension estimation. Its architecture is based on a new coarse-to-fine object proposal that boosts vehicle detection. Moreover, the Deep MANTA network is able to localize vehicle parts even if these parts are not visible. At inference, the network's outputs are used by a real-time robust pose estimation algorithm for fine orientation estimation and 3D vehicle localization. We show in experiments that our method outperforms monocular state-of-the-art approaches on vehicle detection, orientation and 3D location tasks on the very challenging KITTI benchmark.
ISSN:	1063-6919
DOI:	10.1109/CVPR.2017.198