Siamese Attentional Cascade Keypoints Network for Visual Object Tracking

Bibliographic Details
Published in IEEE Access, Vol. 9, pp. 7243-7254
Main Authors Wang, Ershen, Wang, Donglei, Huang, Yufeng, Tong, Gang, Xu, Song, Pang, Tao
Format Journal Article
Language English
Published Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021
Summary: Visual object tracking is an urgent yet challenging task, since it requires the simultaneous and effective classification and state estimation of a target. Research on tracking has therefore attracted considerable attention, yet existing trackers remain limited by deformation, occlusion and motion. In most current tracking methods, target estimation relies on a multi-scale search or on anchors, but these approaches require prior knowledge and too many hyperparameters. To address these issues, we propose a novel Siamese Attentional Cascade Keypoints Network, named SiamACN, which tracks the target precisely by predicting keypoints instead of using anchors. Compared with complex target-state prediction, this anchor-free design avoids troublesome hyperparameters, and a simplified hourglass network with global attention is adopted as the backbone to improve tracking efficiency. Furthermore, the framework predicts keypoints around the target with cascade corner pooling, which simplifies the model. To demonstrate the superiority of our framework, extensive experiments are conducted on five tracking benchmarks: OTB-2015, VOT-2016, VOT-2018, LaSOT and UAV123. Our method achieves leading performance with an accuracy of 61.2% on VOT-2016 and runs at 32 FPS against other competing algorithms, confirming its effectiveness in real-time applications.
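The record does not include the authors' code. As a rough illustration of the anchor-free, keypoint-based Siamese idea the abstract describes, the following PyTorch sketch embeds a template and a search crop with a shared backbone, fuses them by depth-wise cross-correlation, and predicts corner-keypoint heatmaps instead of anchor boxes. All module names, layer sizes and crop sizes are illustrative assumptions; the paper's actual hourglass backbone, global attention and cascade corner pooling are not reproduced here.

```python
# Minimal sketch (not the authors' code) of an anchor-free Siamese
# keypoint tracker: a shared backbone embeds template and search crops,
# depth-wise cross-correlation fuses them, and a small head predicts
# corner-keypoint heatmaps instead of anchor boxes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiamKeypointSketch(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # Stand-in backbone; the paper uses a simplified hourglass
        # network with global attention, which is omitted here.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 7, stride=2, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Head: 2 heatmaps, one per corner keypoint (top-left, bottom-right).
        self.head = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, 2, 1),
        )

    def xcorr(self, z, x):
        # Depth-wise cross-correlation: the template features act as a
        # per-channel kernel slid over the search-region features.
        b, c, h, w = x.shape
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       z.reshape(b * c, 1, *z.shape[-2:]),
                       groups=b * c)
        return out.reshape(b, c, *out.shape[-2:])

    def forward(self, template, search):
        z = self.backbone(template)   # e.g. a 127x127 exemplar crop
        x = self.backbone(search)     # e.g. a 255x255 search crop
        return self.head(self.xcorr(z, x))  # (B, 2, H', W') corner heatmaps

model = SiamKeypointSketch()
heatmaps = model(torch.randn(1, 3, 127, 127), torch.randn(1, 3, 255, 255))
# A bounding box would be decoded from the argmax of each corner heatmap.
```

Predicting two corner heatmaps rather than regressing anchor offsets is what removes the anchor-related hyperparameters (scales, aspect ratios, IoU thresholds) that the abstract criticizes.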
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3046731