Dual-branch adaptive attention transformer for occluded person re-identification

Bibliographic Details
Published in Image and vision computing Vol. 131; p. 104633
Main Authors Lu, Yunhua; Jiang, Mingzi; Liu, Zhi; Mu, Xinyu
Format Journal Article
Language English
Published Elsevier B.V. 01.03.2023
ISSN 0262-8856, 1872-8138
DOI 10.1016/j.imavis.2023.104633

Summary:
•An end-to-end dual-branch vision transformer for occluded person re-identification is proposed.
•Adaptive extraction of human local features using the self-attention mechanism is achieved.
•A Goal Consistency Loss with more consistent convergence goals is designed.
•State-of-the-art performance is achieved on the Occluded-REID dataset.

Occluded person re-identification remains a common and challenging task because people in the real world are often occluded by obstacles (e.g., cars and trees). To locate the unoccluded parts and extract fine-grained local features of the occluded human body, state-of-the-art (SOTA) methods usually rely on a pose estimation model, which often introduces additional bias; the resulting two-stage architecture also complicates the model. To solve this problem, an end-to-end dual-branch Transformer network for occluded person re-identification is designed. Specifically, one branch is a transformer-based global branch responsible for extracting global features, while the other, local branch contains the Selective Token Attention (STA) module. STA uses the multi-head self-attention mechanism to select discriminative tokens for effectively extracting local features. Further, to alleviate the inconsistency between the convergence goals of Softmax Loss and Triplet Loss, Circle Loss is introduced to design the Goal Consistency Loss (GC Loss), which supervises the network. Experiments on four challenging Re-ID datasets (covering both occluded and holistic person Re-ID) show that the method achieves SOTA performance.
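
The summary names Circle Loss as the building block of GC Loss but does not give the GC Loss formula. For orientation only, the following is a minimal Python sketch of the standard pair-similarity form of Circle Loss, not the authors' GC Loss. The function name `circle_loss`, the helper names, and the default values of `margin` and `gamma` are illustrative assumptions; inputs are assumed to be cosine similarities between an anchor and its same-identity (positive) and different-identity (negative) samples.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def circle_loss(pos_sims, neg_sims, margin=0.25, gamma=64.0):
    """Circle Loss for one anchor (illustrative defaults, not from the paper).

    pos_sims: similarities to samples of the same identity.
    neg_sims: similarities to samples of other identities.
    """
    # Optimum similarities and decision margins used by Circle Loss.
    opt_pos, opt_neg = 1.0 + margin, -margin
    delta_pos, delta_neg = 1.0 - margin, margin

    # Self-paced weighting: a pair far from its optimum gets a larger weight,
    # so positives and negatives are pushed toward one shared decision target.
    pos_logits = [-gamma * max(opt_pos - s, 0.0) * (s - delta_pos) for s in pos_sims]
    neg_logits = [gamma * max(s - opt_neg, 0.0) * (s - delta_neg) for s in neg_sims]

    # log(1 + sum_j exp(neg_j) * sum_i exp(pos_i)), computed stably.
    return _softplus(_logsumexp(pos_logits) + _logsumexp(neg_logits))


if __name__ == "__main__":
    separated = circle_loss([0.90, 0.95], [0.10, 0.05])
    confused = circle_loss([0.20, 0.30], [0.80, 0.70])
    print(f"well separated: {separated:.4f}, confused: {confused:.4f}")
```

Because both the positive and the negative terms are expressed over similarity scores, the loss is small when positives score high and negatives score low, and grows when the two overlap. How GC Loss combines this with the classification and metric objectives is not specified in this record.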