Dilated Convolutional Pixels Affinity Network for Weakly Supervised Semantic Segmentation

Bibliographic Details
Published in	Chinese Journal of Electronics Vol. 30; no. 6; pp. 1120 - 1130
Main Authors	Zhe ZHANG, Bilin WANG, Zhezhou YU, Zhiyuan LI
Format	Journal Article
Language	English
Published	Published by the IET on behalf of the CIE, 01.11.2021
Subjects	Convolutional neural networks; Dilated convolution; Self‐attention mechanism; Semantic segmentation; Weakly supervised
Summary:	This paper studies semantic segmentation primarily under image‐level weak supervision. Most recent state‐of‐the‐art methods use deep classification networks to generate small, sparse discriminative seed regions for each target object, which serve as pseudo‐labels for training segmentation networks; these methods perform worse than fully supervised ones. To bridge this gap, we propose a Dilated convolutional pixels affinity network (DCPAN) that localizes and expands the seed regions of objects. Although dilated convolutional units capture additional location information about objects, they falsely highlight true negative regions as the dilation rate increases. To address this problem, we integrate dilated convolutional units with different dilation rates and self‐attention mechanisms to obtain a pixel affinity matrix, which helps the classification network generate high‐quality object seed regions as pseudo‐labels and thereby improves the segmentation network. Although the approach is simple, it achieves competitive performance: experiments on the Pascal VOC 2012 dataset show that DCPAN outperforms other state‐of‐the‐art weakly supervised approaches that use only image‐level labels.
ISSN:	1022-4653 2075-5597
DOI:	10.1049/cje.2021.08.007
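
Illustrative note: the summary describes combining dilated convolutions at several dilation rates with a self‐attention‐style pixel affinity matrix. The toy sketch below shows those two ingredients on a single 1‐D row of pixels in plain Python. It is not the DCPAN architecture: the function names, the averaging kernel, the rates, and the use of the affinity matrix to spread a seed score are all assumptions made for illustration.

```python
import math

def dilated_conv1d(signal, kernel, rate):
    """'Same'-size 1-D cross-correlation whose kernel taps are `rate` samples apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out

def multi_rate_features(signal, kernel, rates):
    """One feature vector per pixel: its response at each dilation rate."""
    responses = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [list(column) for column in zip(*responses)]

def affinity_matrix(features):
    """Row-wise softmax of scaled dot products between pixel feature vectors."""
    scale = math.sqrt(len(features[0]))
    rows = []
    for query in features:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in features]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def propagate(affinity, seeds):
    """Spread seed scores to other pixels, weighted by affinity."""
    return [sum(a * s for a, s in zip(row, seeds)) for row in affinity]

# Hypothetical input: an "object" occupies pixels 2..4; only pixel 3 is a seed.
row = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
seeds = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
features = multi_rate_features(row, [1 / 3, 1 / 3, 1 / 3], rates=(1, 2))
affinity = affinity_matrix(features)
refined = propagate(affinity, seeds)
```

In this toy setting, each row of `affinity` sums to 1, and pixels whose multi‐rate features resemble the seed pixel receive a larger share of its score than featureless background pixels.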