Revisiting Modality-Specific Feature Compensation for Visible-Infrared Person Re-Identification

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 10, pp. 7226-7240
Main Authors: Liu, Jianan; Wang, Jialiang; Huang, Nianchang; Zhang, Qiang; Han, Jungong
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2022
Summary: Although modality-specific feature compensation has become a prevailing paradigm for feature learning in Visible-Infrared Person Re-Identification (VI-ReID), its performance is not promising, especially when compared with modality-shared feature learning. In this paper, by revisiting modality-specific feature compensation based models, we reveal the reasons for their under-performance: (1) images generated for one modality from the other modality may be of poor quality; (2) existing models usually achieve modality-specific feature compensation only via simple pixel-level fusion strategies; (3) generated images cannot fully replace the corresponding missing ones, which introduces extra modality discrepancy. To address these issues, we propose a new Two-Stage Modality Enhancement Network (TSME) for VI-ReID. Concretely, it first accounts for the modality discrepancy in cross-modality style translation and optimizes the structure of the image generators with a new Deeper Skip-connection Generative Adversarial Network (DSGAN) to generate high-quality images. It then presents an attention-based feature-level fusion module, i.e., the Pair-wise Image Fusion (PwIF) module, and an auxiliary learning module, i.e., the Invoking All-Images (IAI) module, to better exploit the generated and original images for reducing modality discrepancy from the perspectives of feature fusion and feature constraints, respectively. Comprehensive experiments demonstrate the success of TSME in tackling the modality discrepancy issue in VI-ReID.
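
To make the idea of attention-based, feature-level fusion of original and generated images concrete, the following PyTorch sketch shows a minimal pair-wise fusion module with channel attention. It is an illustrative assumption only: the class name PairwiseAttentionFusion, the reduction ratio, and the overall layout are hypothetical and are not the paper's actual PwIF implementation.

# Minimal sketch (assumption, not the paper's PwIF module): fuse features of an
# original image and its cross-modality generated counterpart with learned
# per-channel attention weights instead of simple pixel-level fusion.
import torch
import torch.nn as nn


class PairwiseAttentionFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Predict per-channel weights for the two feature maps jointly.
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, kernel_size=1),
        )

    def forward(self, feat_orig: torch.Tensor, feat_gen: torch.Tensor) -> torch.Tensor:
        # feat_orig, feat_gen: (B, C, H, W) features of the original image and
        # of the generated cross-modality image, respectively.
        joint = torch.cat([feat_orig, feat_gen], dim=1)        # (B, 2C, H, W)
        weights = torch.sigmoid(self.attn(self.pool(joint)))   # (B, 2C, 1, 1)
        w_orig, w_gen = torch.chunk(weights, 2, dim=1)          # (B, C, 1, 1) each
        return w_orig * feat_orig + w_gen * feat_gen            # weighted fusion


if __name__ == "__main__":
    fusion = PairwiseAttentionFusion(channels=256)
    f_vis = torch.randn(4, 256, 24, 12)   # visible-image features
    f_gen = torch.randn(4, 256, 24, 12)   # features of a generated infrared view
    print(fusion(f_vis, f_gen).shape)     # torch.Size([4, 256, 24, 12])

Compared with pixel-level averaging of the original and generated images, such a feature-level scheme lets the network down-weight channels that are unreliable in low-quality generated images.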
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2022.3168999