SRRNet: A Semantic Representation Refinement Network for Image Segmentation
| Published in | IEEE Transactions on Multimedia, Vol. 25, pp. 5720-5732 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2023 |
Summary: Semantic context has attracted considerable attention in semantic segmentation. In most cases, it is applied to guide feature learning. Instead, this paper applies it to extract a semantic representation, which records the global feature information of each category in a memory tensor. Specifically, we propose a novel semantic representation (SR) module, which consists of semantic embedding (SE) and semantic attention (SA) blocks. The SE block adaptively embeds features into the semantic representation by calculating memory similarity, and the SA block aggregates the embedded features with semantic attention. The main advantages of the SR module lie in three aspects: i) it enhances the representation ability of semantic context by employing global (cross-image) semantic information; ii) it improves the consistency of intraclass features by aggregating global features of the same categories; and iii) it can be extended to build a semantic representation refinement network (SRRNet) by iteratively applying the SR module across multiple scales, shrinking the semantic gap and enhancing the structural reasoning of the model. Extensive experiments demonstrate that our method significantly improves segmentation results and achieves superior performance on the PASCAL VOC 2012, Cityscapes, and PASCAL Context datasets.
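To make the abstract's description of the SR module concrete, the following is a minimal, hypothetical PyTorch sketch of how a memory tensor, a semantic embedding (SE) step, and a semantic attention (SA) step could fit together. The module name, layer choices, tensor shapes, and fusion scheme are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of an SR-style module: a per-category memory tensor,
# an SE step (pixel-to-memory similarity) and an SA step (attention-weighted
# aggregation of category representations back onto pixels).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SRModuleSketch(nn.Module):
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        # Memory tensor: one global (cross-image) feature vector per category.
        self.memory = nn.Parameter(torch.randn(num_classes, channels))
        self.query_proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.out_proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map from a backbone.
        b, c, h, w = x.shape
        q = self.query_proj(x).flatten(2).transpose(1, 2)   # (B, HW, C)

        # SE step (assumed): similarity between each pixel and the memory,
        # normalized over categories.
        sim = torch.matmul(q, self.memory.t())               # (B, HW, K)
        attn = F.softmax(sim, dim=-1)

        # SA step (assumed): aggregate category representations for each
        # pixel, weighted by its semantic attention.
        sr = torch.matmul(attn, self.memory)                  # (B, HW, C)
        sr = sr.transpose(1, 2).reshape(b, c, h, w)

        # Fuse the semantic representation with the original features.
        return self.out_proj(torch.cat([x, sr], dim=1))


if __name__ == "__main__":
    module = SRModuleSketch(channels=256, num_classes=21)  # e.g. PASCAL VOC 2012
    feats = torch.randn(2, 256, 64, 64)
    print(module(feats).shape)  # torch.Size([2, 256, 64, 64])
```

A multi-scale refinement network in the spirit of SRRNet would then apply such a module repeatedly at several feature resolutions; the exact iteration and fusion strategy is described in the paper itself.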
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2022.3198314