Multi-Task Hierarchical Feature Learning for Real-Time Visual Tracking

Bibliographic Details
Published in: IEEE Sensors Journal, Vol. 19, No. 5, pp. 1961–1968
Main Authors: Kuai, Yangliu; Wen, Gongjian; Li, Dongdong
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.03.2019
Summary: Recently, the tracking community has embraced end-to-end feature learning with convolutional neural networks (CNNs) for visual object tracking. Traditional trackers extract feature maps only from the last convolutional layer of a CNN for feature representation. This single-layer representation ignores target information captured in the earlier convolutional layers. In this paper, we propose a novel hierarchical feature learning framework that captures both high-level semantics and low-level spatial details through multi-task learning. Specifically, feature maps extracted from a shallow layer and a deep layer are fed into a correlation filter layer to encode fine-grained geometric cues and coarse-grained semantic cues, respectively. Our network performs these two feature learning tasks jointly with a multi-task learning strategy. We conduct extensive experiments on three popular tracking benchmarks: OTB, UAV123, and VOT2016. Experimental results show that our method achieves a remarkable performance improvement while running in real time.
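The correlation filter layer mentioned in the summary has a well-known closed-form core that can be sketched compactly. The NumPy example below is an illustration of the general discriminative correlation filter idea (ridge regression solved in the Fourier domain), not the authors' actual network: the random "shallow" and "deep" feature maps, the Gaussian label, the regularizer `lam`, and the equal fusion weights are all assumptions made for the sketch. It learns one single-channel filter per feature level and averages the two response maps, mimicking the paper's fusion of fine-grained geometric and coarse-grained semantic cues:

```python
import numpy as np

def correlation_filter_response(feat, target, lam=1e-2):
    # Closed-form ridge-regression filter in the Fourier domain:
    # W = conj(F) * Y / (F * conj(F) + lam), then correlate with the feature.
    F = np.fft.fft2(feat)
    Y = np.fft.fft2(target)
    W = np.conj(F) * Y / (F * np.conj(F) + lam)
    return np.real(np.fft.ifft2(W * F))  # response map on the same feature

# Synthetic stand-ins for a shallow and a deep feature map of a 32x32 patch.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((32, 32))
deep = rng.standard_normal((32, 32))

# Desired response: a Gaussian peak centred on the target position (16, 16).
yy, xx = np.mgrid[0:32, 0:32]
target = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 2.0 ** 2))

# Fuse the two single-level responses with equal (assumed) weights.
response = 0.5 * correlation_filter_response(shallow, target) \
         + 0.5 * correlation_filter_response(deep, target)

peak = np.unravel_index(np.argmax(response), response.shape)
print(peak)
```

Because each filter is trained and evaluated on the same patch here, the fused response peaks at the labelled target position; in a real tracker the filter learned on one frame would be applied to the search region of the next frame.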
ISSN: 1530-437X; 1558-1748
DOI: 10.1109/JSEN.2018.2883593