Co-attention dictionary network for weakly-supervised semantic segmentation
Published in | Neurocomputing (Amsterdam) Vol. 486; pp. 272 - 285 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 14.05.2022 |
Summary: In this paper, we propose the co-attention dictionary network (CODNet) for weakly-supervised semantic segmentation using only image-level class labels. The CODNet model exploits extra semantic information by jointly leveraging a pair of samples with common semantics through co-attention rather than processing them independently. The inter-sample similarities of spatially distributed deep features are computed to merge reference features through non-local connections. To discover similar patterns regardless of appearance variations, we propose to extract image representations by equipping the neural networks with dictionary learning, which provides universal basis elements for different images. Based on the CODNet model, we propose a multi-reference class activation map (MR-CAM) algorithm which generates semantic segmentation masks for a target image by jointly merging semantic cues from multiple reference images. Experimental results on the PASCAL VOC 2012 and MS COCO benchmark datasets for weakly-supervised semantic segmentation show that the proposed algorithm performs favorably against the state-of-the-art methods.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2021.11.046
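
The summary describes two mechanisms: merging reference features into a target image through non-local, similarity-based co-attention, and averaging semantic cues from multiple reference images into class activation maps. The following PyTorch sketch illustrates that general idea under stated assumptions; the projection layers, residual fusion, and the `multi_reference_cam` helper are hypothetical simplifications, not the authors' CODNet or MR-CAM implementation, and the dictionary-learning component is omitted.

```python
# Minimal sketch (assumptions noted below), not the authors' implementation:
# inter-sample similarities between spatially distributed deep features of a
# target and a reference image are used to pull reference features into the
# target via a non-local attention-weighted sum.
import torch
import torch.nn as nn


class CoAttentionMerge(nn.Module):
    """Merge reference features into target features via non-local affinity."""

    def __init__(self, channels: int, embed_dim: int = 256):
        super().__init__()
        # 1x1 projections into a shared embedding space (assumed design choice).
        self.query = nn.Conv2d(channels, embed_dim, kernel_size=1)
        self.key = nn.Conv2d(channels, embed_dim, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, target_feat: torch.Tensor, ref_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = target_feat.shape
        q = self.query(target_feat).flatten(2).transpose(1, 2)   # (B, HW_t, D)
        k = self.key(ref_feat).flatten(2)                        # (B, D, HW_r)
        v = self.value(ref_feat).flatten(2).transpose(1, 2)      # (B, HW_r, C)

        # Similarity between every target location and every reference location.
        affinity = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)  # (B, HW_t, HW_r)

        # Attention-weighted merge of reference features, reshaped to a feature map.
        merged = (affinity @ v).transpose(1, 2).reshape(b, c, h, w)
        return target_feat + merged  # residual fusion (assumption)


def multi_reference_cam(target_feat, ref_feats, co_attn, classifier):
    """Toy multi-reference CAM: average the class activation maps obtained after
    merging cues from each reference image (simplified stand-in for MR-CAM)."""
    cams = []
    for ref in ref_feats:
        fused = co_attn(target_feat, ref)
        cams.append(torch.relu(classifier(fused)))  # (B, num_classes, H, W)
    return torch.stack(cams, dim=0).mean(dim=0)
```

As a usage note, `classifier` here could be a 1x1 convolution mapping fused features to per-class maps; in practice the backbone features, the fusion strategy, and the CAM post-processing (thresholding, refinement) would follow the paper rather than this simplified average.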