Multi-scale volumes for deep object detection and localization
Published in | Pattern Recognition, Vol. 61, pp. 557-572 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.01.2017 |
Subjects | |
Summary: | This study analyzes the benefits of improved multi-scale reasoning for object detection and localization with deep convolutional neural networks. To that end, it proposes an efficient and general object detection framework that operates on scale volumes of a deep feature pyramid. In contrast, most current state-of-the-art object detectors are trained at a single scale, while testing involves independent evaluation across scales. One benefit of the proposed approach is that it better captures multi-scale contextual information, resulting in significant gains in both detection performance and localization quality on the PASCAL VOC dataset and a multi-view highway vehicles dataset. The scale-specific joint detection and localization models are shown to especially benefit the detection of small objects and of challenging object categories that exhibit large scale variation.
• Multi-scale feature reasoning for deep object detection in images is analyzed.
• A multi-scale contextual reasoning approach is proposed using multi-scale volumes.
• Scale-specific, joint detection and localization models increase robustness.
• The approach efficiently handles challenging cases of large variation in scale. |
---|---|
ISSN: | 0031-3203 1873-5142 |
DOI: | 10.1016/j.patcog.2016.06.002 |
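
The summary contrasts scoring each pyramid level independently with reasoning over a scale volume. The sketch below is one possible reading of that idea, not the authors' implementation: it assumes a feature pyramid stored as a list of NumPy arrays of shape (H, W, C), and the names `crop_window`, `scale_volume`, and `radius` are hypothetical. A scale volume is taken here to be the same window location sampled at several adjacent pyramid levels and stacked along a new scale axis.

```python
import numpy as np

def crop_window(feat, cy, cx, h, w):
    """Crop an h x w window from a (H, W, C) feature map.

    (cy, cx) are relative positions in [0, 1] that place the window's
    top-left corner between the first and last valid offsets, so the same
    (cy, cx) addresses a corresponding location at every pyramid level.
    """
    H, W, _ = feat.shape
    if H < h or W < w:
        raise ValueError("feature map is smaller than the window")
    top = int(round(cy * (H - h)))
    left = int(round(cx * (W - w)))
    return feat[top:top + h, left:left + w, :]

def scale_volume(pyramid, level, cy, cx, h, w, radius=1):
    """Stack windows from levels level-radius .. level+radius (assumed layout)."""
    lo, hi = level - radius, level + radius
    if lo < 0 or hi >= len(pyramid):
        raise IndexError("scale volume extends past the pyramid")
    windows = [crop_window(pyramid[l], cy, cx, h, w) for l in range(lo, hi + 1)]
    return np.stack(windows, axis=0)  # shape: (2 * radius + 1, h, w, C)

# Toy pyramid with three levels and four feature channels (random values).
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((s, s, 4)) for s in (16, 12, 8)]

single = crop_window(pyramid[1], 0.5, 0.5, 5, 5)   # single-scale descriptor
volume = scale_volume(pyramid, 1, 0.5, 0.5, 5, 5)  # multi-scale descriptor
```

In this sketch a single-scale model would score `single` alone, while a model trained on `volume` also sees the neighbouring levels, which is one way the multi-scale context described in the summary could enter the classifier.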