Spatio-temporal interactive fusion based visual object tracking method

Bibliographic Details
Published in: Frontiers in Physics, Vol. 11
Main Authors: Huang, Dandan; Yu, Siyu; Duan, Jin; Wang, Yingzhi; Yao, Anni; Wang, Yiwen; Xi, Junhan
Format: Journal Article
Language: English
Published: Frontiers Media S.A., 29.11.2023
Summary: Visual object tracking methods often struggle to exploit inter-frame correlation and to handle challenges such as local occlusion, deformation, and background interference. To address these issues, this paper proposes a spatio-temporal interactive fusion (STIF) based visual object tracking method. The goal is to fully utilize spatio-temporal background information, enhance feature representation for object recognition, improve tracking accuracy, adapt to object changes, and reduce model drift. The proposed method incorporates feature-enhancement networks in both the temporal and spatial dimensions, leveraging spatio-temporal background information to extract salient features that improve object recognition and tracking accuracy while increasing the model's adaptability to object changes and minimizing model drift. A spatio-temporal interactive fusion network then learns a similarity metric between the memory frame and the query frame using the enhanced features, filtering out stronger feature representations through the interactive fusion of information. The proposed tracker is evaluated on four challenging public datasets. The results show that it achieves state-of-the-art (SOTA) performance and significantly improves tracking accuracy in complex scenarios affected by local occlusion, deformation, and background interference. Notably, the method achieves a success rate of 78.8% on TrackingNet, a large-scale tracking dataset.
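The record gives no implementation details, but the described step of "learning a similarity metric between the memory frame and the query frame" resembles a cross-attention style fusion commonly used in memory-based trackers. The following is a minimal NumPy sketch under that assumption only; the function and variable names (`interactive_fusion`, `query_feat`, `memory_feat`) are hypothetical and are not taken from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_fusion(query_feat, memory_feat):
    """Fuse query-frame features with memory-frame features via a
    scaled dot-product similarity (cross-attention style) metric.

    Shapes: query_feat (Nq, C), memory_feat (Nm, C).
    """
    # Similarity between every query location and every memory location
    sim = query_feat @ memory_feat.T / np.sqrt(query_feat.shape[1])
    attn = softmax(sim, axis=-1)        # attention weights over memory positions
    aggregated = attn @ memory_feat     # memory information propagated to the query
    return query_feat + aggregated      # residual-style interactive fusion

rng = np.random.default_rng(0)
q = rng.standard_normal((16, 64))   # query-frame features: 16 locations, 64-d
m = rng.standard_normal((32, 64))   # memory-frame features: 32 locations, 64-d
fused = interactive_fusion(q, m)
print(fused.shape)  # (16, 64): same shape as the query features
```

The residual connection keeps the original query features intact while adding memory context, which is one common way such fusion modules reduce drift when the memory frame is unreliable (e.g., under occlusion).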
ISSN: 2296-424X
DOI: 10.3389/fphy.2023.1269638