Learning convolutional self-attention module for unmanned aerial vehicle tracking


Bibliographic Details
Published in Signal, image and video processing Vol. 17; no. 5; pp. 2323 - 2331
Main Authors Wang, Jun, Meng, Chenchen, Deng, Chengzhi, Wang, Yuanyun
Format Journal Article
Language English
Published London Springer London 01.07.2023
Springer Nature B.V
Summary: Siamese network-based trackers have been shown to deliver strong performance. Recently, visual tracking has been applied to unmanned aerial vehicle (UAV) tasks. However, UAV tracking is challenging because of aspect ratio changes, out-of-view targets, scale variation, and similar factors. Some Siamese-based trackers ignore the context-related information generated along the time dimension of continuous frames, lose much of the foreground information, and generate redundant background information. In this paper, we propose a novel feature fusion network based on convolutional self-attention blocks. The convolutional self-attention blocks are composed of ResNet bottleneck blocks with multi-head self-attention (MHSA) blocks. We remove the limitation of the spatial (3 × 3) convolution operator by using MHSA blocks in the last-stage bottleneck blocks of ResNet. Convolutional self-attention blocks capture the global context-related information of the given target images and further improve the accuracy of global matching between a given target and a search region. We conduct extensive experimental evaluations on OTB2015 and four UAV benchmarks, i.e., UAV123, UAV20L, DTB70 and UAV123@10fps. The experimental results demonstrate that the proposed tracker achieves excellent performance against state-of-the-art (SOTA) trackers for UAV tracking and runs in real time at an average tracking speed of 181 fps on a single GPU.
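The idea in the summary of swapping the spatial 3 × 3 convolution for multi-head self-attention can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, and the function names (`mhsa`, `flatten_feature_map`, `random_matrix`), the toy sizes, and the random weights are all assumptions made for illustration. The 1 × 1 reduce/expand convolutions, normalization layers, and any positional encoding of a real bottleneck block are omitted.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vec):
    """Multiply an (out x in) weight matrix by a length-in vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def mhsa(tokens, wq, wk, wv, num_heads):
    """Multi-head self-attention over N token vectors of length C.

    Each token is one spatial position of a feature map. The C x C
    matrices wq, wk, wv act like 1 x 1 convolutions (hypothetical weights).
    Returns (outputs, attn), where attn[h][i][j] is the weight that head h
    gives position j when updating position i.
    """
    c = len(tokens[0])
    d = c // num_heads  # channels per head
    q = [matvec(wq, t) for t in tokens]
    k = [matvec(wk, t) for t in tokens]
    v = [matvec(wv, t) for t in tokens]
    outputs = [[0.0] * c for _ in tokens]
    attn = []
    for h in range(num_heads):
        lo, hi = h * d, (h + 1) * d
        head_attn = []
        for i, qi in enumerate(q):
            # Scaled dot-product scores against every position in the map.
            scores = [
                sum(a * b for a, b in zip(qi[lo:hi], kj[lo:hi])) / math.sqrt(d)
                for kj in k
            ]
            weights = softmax(scores)
            head_attn.append(weights)
            for ch in range(lo, hi):
                outputs[i][ch] = sum(w * vj[ch] for w, vj in zip(weights, v))
        attn.append(head_attn)
    return outputs, attn


def flatten_feature_map(fmap):
    """Turn an H x W x C nested list into H*W token vectors."""
    return [pixel for row in fmap for pixel in row]


def random_matrix(rng, n):
    """Random n x n matrix standing in for learned projection weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
H, W, C, HEADS = 4, 4, 8, 2  # toy sizes, chosen only for illustration
fmap = [[[rng.uniform(-1, 1) for _ in range(C)] for _ in range(W)] for _ in range(H)]
tokens = flatten_feature_map(fmap)
wq, wk, wv = (random_matrix(rng, C) for _ in range(3))
out, attn = mhsa(tokens, wq, wk, wv, HEADS)

# Residual connection, as in a ResNet bottleneck: input plus attention branch.
fused = [[x + y for x, y in zip(t, o)] for t, o in zip(tokens, out)]
```

In this sketch every output position receives a nonzero weight from all 16 positions of the 4 × 4 map, whereas a 3 × 3 convolution mixes at most 9 neighbouring positions per output. That difference is the sense in which self-attention captures global context.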
ISSN: 1863-1703, 1863-1711
DOI: 10.1007/s11760-022-02449-z