Deep saliency detection via spatial-wise dilated convolutional attention

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 445, pp. 35-49
Main Authors: Cui, Wenzhao; Zhang, Qing; Zuo, Baochuan
Format: Journal Article
Language: English
Published: Elsevier B.V., 20.07.2021

Summary: Saliency detection aims to highlight the regions of an image that most strongly attract human attention and stand out from their surroundings. In recent years, deep learning-based saliency detection has achieved substantial performance gains over conventional methods, yet it still faces significant challenges in multi-feature fusion and in enlarging the receptive field. Current top-performing saliency detectors built on FCNs benefit from powerful feature representations but suffer from high computational costs because they integrate multi-scale features without distinction. In this paper, we propose a novel and simple network, DCAM, built on an attention mechanism with dilated convolutions (DAM), which incorporates multi-scale features with an enlarged receptive field. Specifically, we apply DAM to guide each side output individually, selectively emphasizing significant regions and thus efficiently enhancing the representation ability of each layer. The spatial attention module locates the image areas with the greatest impact on saliency and assigns them higher weights. In addition, we adopt an FPN to integrate features from adjacent layers and a CRF scheme to refine the saliency results. Experiments on five benchmark datasets demonstrate that the proposed approach performs favorably against five state-of-the-art methods at a fast speed (56 FPS on a single GPU).
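
The abstract gives no implementation details, so the following is a minimal sketch of what a dilated-convolution spatial attention module of this kind might look like, assuming a PyTorch-style implementation. The class name DilatedSpatialAttention, the dilation rates (1, 2, 4), and the channel sizes are illustrative assumptions, not the authors' published DAM configuration.

# Illustrative sketch of a spatial attention module built from parallel
# dilated convolutions, in the spirit of the DAM described in the abstract.
# Names, dilation rates, and channel sizes are assumptions for illustration.
import torch
import torch.nn as nn

class DilatedSpatialAttention(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        # Parallel dilated 3x3 branches enlarge the receptive field
        # without reducing spatial resolution.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d)
            for d in dilations
        )
        # A 1x1 conv collapses the concatenated branches into a
        # single-channel spatial attention map.
        self.fuse = nn.Conv2d(channels * len(dilations), 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        attn = torch.sigmoid(self.fuse(feats))  # (N, 1, H, W) weights in [0, 1]
        return x * attn                         # re-weight salient regions

# Usage: re-weight one side-output feature map before FPN-style fusion.
side = torch.randn(1, 64, 56, 56)
out = DilatedSpatialAttention(64)(side)
print(out.shape)  # torch.Size([1, 64, 56, 56])

In this reading, each side output of the backbone would pass through such a module before adjacent-layer fusion, so that higher weights fall on the regions the attention map marks as salient.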
ISSN: 0925-2312
eISSN: 1872-8286
DOI: 10.1016/j.neucom.2021.02.061