EANTrack: An Efficient Attention Network for Visual Tracking


Bibliographic Details
Published in: IEEE Transactions on Automation Science and Engineering, Vol. 21, no. 4, pp. 1-18
Main Authors: Gu, Fengwei; Lu, Jun; Cai, Chengtao; Zhu, Qidan; Ju, Zhaojie
Format: Journal Article
Language: English
Published: IEEE, 03.10.2023
Summary: Recently, Siamese trackers have gained widespread attention in visual tracking due to their exceptional performance. However, many trackers still suffer from limitations in challenging scenarios, such as fast motion and scale variation, which hinder the full exploitation of target features. Consequently, the accuracy and efficiency of the trackers are limited. Therefore, this paper proposes an efficient attention network, called EAN, to improve tracking performance. The EAN comprises three primary components, namely a Transformer-s subnetwork, a Transformer-t subnetwork, and a Feature-Fused Attention Module (FFAM). The designed Transformer-s and Transformer-t subnetworks adopt complementary structures and functions to fully integrate and emphasize the relevant feature information, including channel and spatial features. The FFAM is responsible for fusing the multi-level features from both subnetworks, which establishes the global dependencies between the templates and search regions and enhances the discriminative power of the model. To further improve the tracking accuracy, a novel Feature-Aware Attention Module (FAAM) is introduced into the tracking prediction head to enhance the feature representation capability of the model. Finally, we propose an efficient EANTrack tracker based on EAN for robust tracking in complex scenarios, which exhibits significant advantages in challenging attributes. Experimental results on multiple benchmarks indicate that our approach achieves remarkable tracking performance with a real-time running speed of 55.6 fps.

Note to Practitioners: Siamese trackers have garnered considerable attention in the field of visual tracking due to their impressive performance. However, these trackers often face limitations in challenging scenarios, which impede the complete exploitation of target features. As a result, the accuracy and efficiency of many trackers are compromised. To address these issues, we propose an efficient tracker called EANTrack to enable robust tracking in complex scenarios. Our EANTrack exhibits significant advantages in handling challenging attributes. Please refer to our complete paper for detailed information on the EANTrack tracker and experimental results. Practitioners in the field can benefit from our research by leveraging our findings and methodologies in their work. We encourage further exploration and experimentation to enhance the performance and applicability of visual tracking systems.
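
The summary says the FFAM establishes global dependencies between the templates and search regions, but this record does not give the module's internals. As a rough illustration of the general idea only, the sketch below shows generic scaled dot-product cross-attention, in which every search-region feature attends to every template feature. It is not the authors' FFAM: the function names, the toy feature values, and the absence of learned projections are all assumptions made for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(search, template):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    search:   list of N search-region feature vectors (used as queries)
    template: list of M template feature vectors (used as keys and values)
    Returns N fused vectors, each a weighted mix of all template vectors.
    No learned projections are applied; this is an illustrative assumption.
    """
    dim = len(template[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in search:
        # Similarity of this search position to every template position.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in template]
        weights = softmax(scores)
        # Weighted sum of template vectors, one channel at a time.
        fused.append([
            sum(w * value[d] for w, value in zip(weights, template))
            for d in range(dim)
        ])
    return fused


# Toy features (made up): 3 search positions, 2 template positions, 2 channels.
search_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
template_feats = [[2.0, 0.0], [0.0, 2.0]]
fused_feats = cross_attention(search_feats, template_feats)
```

Because each output is a weighted mix over all template positions, every search position depends on the whole template, which is the sense of "global dependency" used here. A search feature that matches both template vectors equally receives their plain average.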
ISSN: 1545-5955, 1558-3783
DOI: 10.1109/TASE.2023.3319676