Learning attention modules for visual tracking

Bibliographic Details
Published in	Signal, Image and Video Processing, Vol. 16, no. 8, pp. 2149-2156
Main Authors	Wang, Jun; Meng, Chenchen; Deng, Chengzhi; Wang, Yuanyun
Format	Journal Article
Language	English
Published	London: Springer London, 01.11.2022; Springer Nature B.V.
Subjects	Algorithms; Artificial neural networks; Computer Imaging; Computer networks; Computer Science; Image Processing and Computer Vision; Modules; Multimedia Information Systems; Optical tracking; Original Paper; Pattern Recognition and Graphics; Signal, Image and Speech Processing; Training; Vision; Siamese networks; Visual tracking; Channel attention; Spatial attention

Summary:	Siamese networks have been widely used in visual tracking. However, they struggle with complex appearance variations when discriminative background information is ignored and an offline training strategy is adopted. In this paper, we present a novel backbone network that combines a CNN model with an attention mechanism in the Siamese framework. The attention mechanism is composed of a channel attention module and a spatial attention module. The channel attention module uses learned global information to selectively emphasize convolutional features, which enhances the network's representation ability. The spatial attention module captures more contextual information and semantic features of target candidates. The resulting attention-based backbone is lightweight and tracks in real time. We use GOT-10k as the training set to adjust the trained model parameters offline. Extensive experimental evaluations on the OTB2015, VOT2016, VOT2018, GOT-10k and UAV123 datasets demonstrate that the proposed algorithm achieves excellent performance against state-of-the-art trackers.
ISSN:	1863-1703 1863-1711
DOI:	10.1007/s11760-022-02177-4