A Mutually Supervised Graph Attention Network for Few-Shot Segmentation: The Perspective of Fully Utilizing Limited Samples

Fully supervised semantic segmentation has performed well in many computer vision tasks. However, it is time-consuming because training a model requires a large number of pixel-level annotated samples. Few-shot segmentation has recently become a popular approach to addressing this problem, as it req...

Full description

Saved in:

Bibliographic Details
Published in	IEEE transaction on neural networks and learning systems Vol. 35; no. 4; pp. 1 - 13
Main Authors	Gao, Honghao, Xiao, Junsheng, Yin, Yuyu, Liu, Tong, Shi, Jiangang
Format	Journal Article
Language	English
Published	United States IEEE 01.04.2024 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Annotations Computer vision Convolution Datasets Feature maps Few-shot learning graph attention network graph reasoning Graph theory Image edge detection Image enhancement Image processing Image segmentation mutually supervised regime (MSR) Neural networks Pixels Queries Semantic segmentation Semantics Spatial data Task analysis Training
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Fully supervised semantic segmentation has performed well in many computer vision tasks. However, it is time-consuming because training a model requires a large number of pixel-level annotated samples. Few-shot segmentation has recently become a popular approach to addressing this problem, as it requires only a handful of annotated samples to generalize to new categories. However, the full utilization of limited samples remains an open problem. Thus, in this article, a mutually supervised few-shot segmentation network is proposed. First, the feature maps from intermediate convolution layers are fused to enrich the capacity of feature representation. Second, the support image and query image are combined into a bipartite graph, and the graph attention network is adopted to avoid losing spatial information and increase the number of pixels in the support image to guide the query image segmentation. Third, the attention map of the query image is used as prior information to enhance the support image segmentation, which forms a mutually supervised regime. Finally, the attention maps of the intermediate layers are fused and sent into the graph reasoning layer to infer the pixel categories. Experiments are conducted on the PASCAL VOC- 5^i dataset and FSS-1000 dataset, and the results demonstrate the effectiveness and superior performance of our method compared with other baseline methods.
Bibliography:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23
ISSN:	2162-237X 2162-2388 2162-2388
DOI:	10.1109/TNNLS.2022.3155486