MDFN: Multi-scale deep feature learning network for object detection

Bibliographic Details
Published in	Pattern recognition Vol. 100; p. 107149
Main Authors	Ma, Wenchi, Wu, Yuanwei, Cen, Feng, Wang, Guanghui
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.04.2020
Subjects	Deep feature learning; Multi-scale; Semantic and contextual information; Small and occluded objects
Summary:
•The paper proposes a new model that focuses on learning the deep features produced in the latter part of the network.
•Accurate detection results are achieved by making full use of the semantic and contextual information expressed by deep features.
•The proposed deep feature learning inception modules activate multi-scale receptive fields within a wide range at a single layer level.
•The paper demonstrates that features produced in the deeper part of networks have a prevailing impact on the accuracy of object detection.

This paper proposes an innovative object detector that leverages deep features learned in high-level layers. Compared with features produced in earlier layers, deep features are better at expressing semantic and contextual information. The proposed deep feature learning scheme shifts the focus from concrete, detailed features to abstract ones that carry semantic information. By building a multi-scale deep feature learning network (MDFN), it considers not only individual objects and local contexts but also their relationships. MDFN detects objects efficiently by introducing information square and cubic inception modules into the high-level layers; these modules employ parameter sharing to improve computational efficiency. MDFN provides a multi-scale object detector by integrating multi-box, multi-scale and multi-level technologies.

Although MDFN employs a simple framework with a relatively small base network (VGG-16), its detection results are better than or competitive with those of detectors that rely on a macro hierarchical structure, either very deep or very wide, for stronger feature extraction. The proposed technique is evaluated extensively on the KITTI, PASCAL VOC, and COCO datasets; it achieves the best results on KITTI and leading performance on PASCAL VOC and COCO.

This study reveals that deep features provide prominent semantic information and a variety of contextual content, which contribute to MDFN's superior performance in detecting small or occluded objects. In addition, the MDFN model is computationally efficient, offering a good trade-off between accuracy and speed.
ISSN:	0031-3203 1873-5142
DOI:	10.1016/j.patcog.2019.107149
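
Note on the summary: the record does not describe the internals of the inception modules, so the following is only a generic sketch of how parameter sharing can yield multi-scale receptive fields at a single layer level, not the authors' implementation. It assumes a chain of stride-1 3x3 convolutions whose intermediate outputs are tapped as separate branches, and compares the weight count with independent 3x3, 5x5 and 7x7 kernels. All function names and the channel count of 256 are hypothetical, and biases are ignored.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field, in input pixels, of a chain of convolution layers."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) input steps
        jump *= s             # spacing between adjacent outputs, in input pixels
    return rf


def conv_weights(kernel_size, channels):
    """Weights in one square convolution with equal input and output channels."""
    return kernel_size * kernel_size * channels * channels


def shared_branch_fields(depth):
    """Receptive fields seen by tapping a chain of 3x3 convs after each layer."""
    return [receptive_field([3] * n) for n in range(1, depth + 1)]


def shared_weights(depth, channels):
    """Weights when every branch reuses the same chain of 3x3 convolutions."""
    return depth * conv_weights(3, channels)


def independent_weights(fields, channels):
    """Weights when each receptive field gets its own full-size kernel."""
    return sum(conv_weights(k, channels) for k in fields)


if __name__ == "__main__":
    channels = 256  # hypothetical channel count, chosen only for illustration
    fields = shared_branch_fields(3)
    print("branch receptive fields:", fields)
    print("independent kernels:", independent_weights(fields, channels))
    print("shared 3x3 chain:   ", shared_weights(3, channels))
```

Under these assumptions the three branches cover the same receptive fields as 3x3, 5x5 and 7x7 kernels while using roughly a third of the weights, which is the kind of saving the summary attributes to parameter sharing.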