Efficient Context-Guided Stacked Refinement Network for RGB-T Salient Object Detection

RGB-T salient object detection (SOD) aims at utilizing the complementary cues of RGB and Thermal (T) modalities to detect and segment the common objects. However, on one hand, existing methods simply fuse the features of two modalities without fully considering the characters of RGB and T. On the ot...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on circuits and systems for video technology Vol. 32; no. 5; pp. 3111 - 3124
Main Authors Huo, Fushuo, Zhu, Xuegui, Zhang, Lei, Liu, Qifeng, Shu, Yu
Format Journal Article
LanguageEnglish
Published New York IEEE 01.05.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:RGB-T salient object detection (SOD) aims at utilizing the complementary cues of RGB and Thermal (T) modalities to detect and segment the common objects. However, on one hand, existing methods simply fuse the features of two modalities without fully considering the characters of RGB and T. On the other hand, the high computational cost of existing methods prevents them from real-world applications (e.g., automatic driving, abnormal detection, person re-ID). To this end, we proposed an efficient encoder-decoder network named Context-guided Stacked Refinement Network (CSRNet). Specifically, we utilize a lightweight backbone and design efficient decoder parts, which greatly reduce the computational cost. To fuse RGB and T modalities, we proposed an efficient Context-guided Cross Modality Fusion (CCMF) module to filter the noise and explore the complementation of two modalities. Besides, Stacked Refinement Network (SRN) progressively refines the features from top to down via the interaction of semantic and spatial information. Extensive experiments show that our method performs favorably against state-of-the-art algorithms on RGB-T SOD task while with small model size (4.6M), few FLOPs (4.2G), and real-time speed (38 fps ). Our codes is available at: https://github.com/huofushuo/CSRNet .
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1051-8215
1558-2205
DOI:10.1109/TCSVT.2021.3102268