Hierarchical Tracking by Reinforcement Learning-Based Searching and Coarse-to-Fine Verifying
A class-agnostic tracker typically consists of three key components, i.e., its motion model, its target appearance model, and its updating strategy. However, most recent top-performing trackers mainly focus on constructing complicated appearance models and updating strategies, while using comparativ...
Saved in:
Published in | IEEE transactions on image processing Vol. 28; no. 5; pp. 2331 - 2341 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published |
United States
IEEE
01.05.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | A class-agnostic tracker typically consists of three key components, i.e., its motion model, its target appearance model, and its updating strategy. However, most recent top-performing trackers mainly focus on constructing complicated appearance models and updating strategies, while using comparatively simple and heuristic motion models that may result in an inefficient search and degrade the tracking performance. To address this issue, we propose a hierarchical tracker that learns to move and track based on the combination of data-driven search at the coarse level and coarse-to-fine verification at the fine level. At the coarse level, a data-driven motion model learned from deep recurrent reinforcement learning provides our tracker with coarse localization of an object. By formulating motion search as an action-decision problem in reinforcement learning, our tracker utilizes a recurrent convolutional neural network-based deep Q-network to effectively learn data-driven searching policies. The learned motion model can not only significantly reduce the search space but also provide more reliable interested regions for further verifying. At the fine level, a kernelized correlation filter (KCF)-based appearance model is adopted to densely yet efficiently verify a local region centered on the predicted location from the motion model. Through use of circulant matrices and fast Fourier transformation, a large number of candidate samples in the local region can be efficiently and effectively evaluated by the KCF-based appearance model. Finally, a simple yet robust estimator is designed to analyze possible tracking failure. The experiments on OTB50 and OTB100 illustrate that our tracker achieves better performance than the state-of-the-art trackers. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
ISSN: | 1057-7149 1941-0042 |
DOI: | 10.1109/TIP.2018.2885238 |