RGB-T salient object detection via excavating and enhancing CNN features


Bibliographic Details
Published in: Applied Intelligence (Dordrecht, Netherlands), Vol. 53, No. 21, pp. 25543-25561
Main Authors: Bi, Hongbo; Zhang, Jiayuan; Wu, Ranwan; Tong, Yuyu; Fu, Xiaowei; Shao, Keyong
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.11.2023

Summary: RGB-T salient object detection aims to identify the most attractive object(s) in a scene using RGB and thermal data. For this task, on the one hand, how to excavate salient clues is crucial to improving the detection performance of the model; on the other hand, strengthening the representation of salient features remains a major challenge. To address these issues, this paper proposes excavating and enhancing CNN features to boost the performance of RGB-T salient object detection (E2Net). Specifically, we first design a new joint attention module (JAM) that jointly considers both the channel dimension and pixel location to fully explore effective salient features, encoding contextual information into local features while simultaneously mining useful clues within the channels. In contrast to previous works that enlarge the receptive field, we propose a feature enhancement module (FEM) to enhance the feature representation: it realizes parallel and independent learning across four branches based on a channel-splitting strategy and greatly strengthens the feature expression ability. Extensive experiments show that our proposed model outperforms twenty-two existing state-of-the-art models on three challenging benchmark datasets. Codes and results are available at: https://github.com/RanwanWu/E2Net .
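The channel-splitting strategy behind the FEM can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name and the per-branch scaling transforms are hypothetical placeholders standing in for the learned convolutions in each branch; only the split-process-concatenate pattern follows the abstract's description.

```python
import numpy as np

def channel_split_enhance(x: np.ndarray) -> np.ndarray:
    """Toy sketch of a four-branch channel-splitting module.

    x: feature map of shape (C, H, W), with C divisible by 4.
    The channels are split into four groups, each branch is
    transformed independently (here a simple scaling stands in
    for a learned convolution), and the results are concatenated
    back along the channel axis.
    """
    c = x.shape[0]
    assert c % 4 == 0, "channel count must split evenly into 4 branches"
    branches = np.split(x, 4, axis=0)
    # Placeholder per-branch transforms (hypothetical); a real module
    # would apply different learned operations in each branch.
    scales = [1.0, 0.5, 2.0, 1.5]
    outputs = [s * b for s, b in zip(scales, branches)]
    return np.concatenate(outputs, axis=0)

x = np.ones((8, 4, 4), dtype=np.float32)
y = channel_split_enhance(x)
print(y.shape)  # (8, 4, 4): channel count is preserved after concatenation
```

Because the four branches never exchange information until the final concatenation, they can be computed in parallel, which is the independence property the abstract highlights.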
ISSN: 0924-669X (print), 1573-7497 (electronic)
DOI: 10.1007/s10489-023-04784-1