Cross channel aggregation similarity network for salient object detection


Bibliographic Details
Published in International journal of machine learning and cybernetics Vol. 13; no. 8; pp. 2153-2169
Main Authors Chen, Liyuan; Liu, Huawen; Mo, Jiashuaizi; Zhang, Dawei; Yang, Jie; Lin, Feilong; Zheng, Zhonglong; Jia, Riheng
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.08.2022
Springer Nature B.V
Summary: Salient object detection is an efficient preprocessing technique for binary segmentation tasks. Existing deep-learning-based works have achieved an enormous leap forward, with outstanding performance in the field of computer vision. Most previous methods adopted multi-scale fusion and attention mechanisms to facilitate efficient feature extraction, yet ignored necessary global context characteristics and the computational limitations of general models. To mitigate the adverse effects of feature dilution during top-down transmission, we propose a cross channel aggregation similarity network (CCANet) with three modules. The cross channel aggregation module retains high-response channels from feature maps integrated across different layers to extract efficient global context information. The similarity fusion module calculates the similarity among features comprising high-level semantic, low-level spatial, and global context information to enhance the complementarity of the maps. The dense residual module extracts denser features under multi-scale receptive fields to improve the density of prediction maps. In addition, a combined loss function with a modified weighted binary cross-entropy is applied to alleviate the class imbalance issue incurred during training. Benefiting from this overall design, experimental results show that CCANet achieves state-of-the-art performance on six public benchmark datasets. Without any post-processing operations, it runs real-time inference at around 32 FPS when processing a 320 × 320 image.
ISSN: 1868-8071, 1868-808X
DOI: 10.1007/s13042-022-01512-y
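
Note on the loss mentioned in the summary: the record does not give the form of the paper's modified weighted binary cross-entropy, so the sketch below is only a generic class-balanced variant, shown to illustrate how weighting can counter the imbalance between salient and background pixels. The function name `weighted_bce` and the inverse-frequency weights are assumptions for illustration, not CCANet's definition.

```python
import math


def weighted_bce(pred, target, eps=1e-7):
    """Class-balanced binary cross-entropy over flat lists of pixels.

    pred:   predicted saliency probabilities in [0, 1]
    target: ground-truth labels (0 = background, 1 = salient)

    Assumption: each class is weighted by the frequency of the other
    class, so the rarer class contributes more per pixel. This is a
    common balancing scheme, not necessarily the one used in the paper.
    """
    n = len(target)
    n_pos = sum(target)
    w_pos = (n - n_pos) / n  # weight for salient pixels
    w_neg = n_pos / n        # weight for background pixels

    total = 0.0
    for p, t in zip(pred, target):
        # Clamp to avoid log(0) on saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        total -= w_pos * t * math.log(p) + w_neg * (1 - t) * math.log(1.0 - p)
    return total / n


# One salient pixel among four: the single positive carries as much
# total weight as the three negatives combined.
target = [1, 0, 0, 0]
uninformed = weighted_bce([0.5, 0.5, 0.5, 0.5], target)
confident = weighted_bce([0.9, 0.1, 0.1, 0.1], target)
```

With these inputs the confident prediction yields a lower loss than the uninformed one, and the lone salient pixel is not drowned out by the background majority, which is the effect a weighted loss is meant to achieve.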