Semantic Segmentation Network combining Gaussian Perception and Iterative Multi-Scale Attention

Bibliographic Details
Published in: Multimedia Systems, Vol. 31, No. 5
Main Authors: Ke, Zunwang; Wang, Guosheng; Zhang, Yugui; Shi, YunLong; Guo, Fengyu; Zou, Yuelin; Li, Zhaofan; Guo, Run; Zhou, Ji-Sheng
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V.), 01.10.2025

Summary: Semantic segmentation is crucial in autonomous driving, offering detailed scene understanding to tackle challenges such as small object edges and blurred textures in complex traffic environments. By performing pixel-level classification, it provides vehicles with comprehensive environmental information, ensuring safe navigation. However, when dealing with fine, fuzzy-bordered objects in complex scenes, existing techniques suffer from low segmentation accuracy due to insufficient feature extraction capability. To address this problem, this study proposes a semantic segmentation network that combines Gaussian perception with iterative multi-scale attention. The method integrates Gaussian perception with local–global channel attention, accurately models pixel associations, and dynamically focuses on features to address edge blurring in complex-scene segmentation. It also employs a difference module to enhance low-frequency features and integrates the iterative multi-scale attention mechanism to achieve a deep fusion of low-frequency and high-frequency information; this sharpens fine-feature capture and mitigates the edge-discontinuity problem caused by masked boundary information. In addition, the method combines channel and spatial attention to optimize feature extraction, enlarge the receptive field, and improve detail, context, and boundary recognition, which significantly strengthens feature expression and reduces background mis-segmentation. Experiments show that the proposed method achieves 79.34% mIoU on the Cityscapes validation set (a 1.32% improvement over PIDNet-S) and 81.48% mIoU on the CamVid test set (a 1.05% improvement over PIDNet-S).
These results demonstrate clear gains over existing state-of-the-art methods, confirming the effectiveness of the approach for semantic segmentation of complex urban scenes. The source code is publicly available on GitHub: https://github.com/wgsheng897/GMSANet.git
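The mIoU figures quoted above follow the standard mean Intersection-over-Union metric used on Cityscapes and CamVid. A minimal sketch of how that metric is computed per class is shown below; the function name `mean_iou` and the toy label maps are illustrative, not taken from the paper's released code.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union over the classes present in the data.

    pred, target: integer label maps of identical shape.
    Classes absent from both prediction and ground truth are skipped
    rather than counted as 0, matching common benchmark practice.
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class does not occur at all: ignore it
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x4 label maps with 3 classes (one mislabeled pixel in row 1)
pred   = np.array([[0, 0, 1, 1],
                   [2, 2, 1, 0]])
target = np.array([[0, 0, 1, 1],
                   [2, 1, 1, 0]])
print(mean_iou(pred, target, 3))  # -> 0.75
```

Per-class IoUs here are 1.0 (class 0), 0.75 (class 1), and 0.5 (class 2), averaging to 0.75.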
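The abstract's combination of channel and spatial attention can be illustrated with a CBAM-style gating sequence: squeeze spatial dimensions to weight channels, then pool across channels to weight spatial positions. The NumPy sketch below shows only this gating pattern; the paper's actual module (with learned projections and the Gaussian-perception component) is more elaborate, and all function names here are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    # x: (C, H, W). Global-average-pool the spatial dims,
    # then gate each channel with a value in (0, 1).
    w = sigmoid(x.mean(axis=(1, 2)))          # shape (C,)
    return x * w[:, None, None]

def spatial_attention(x):
    # Pool across channels (average and max), then gate
    # each spatial position with a value in (0, 1).
    avg = x.mean(axis=0)                      # shape (H, W)
    mx = x.max(axis=0)                        # shape (H, W)
    w = sigmoid(avg + mx)
    return x * w[None, :, :]

def combined_attention(x):
    # Channel attention followed by spatial attention,
    # preserving the input's (C, H, W) shape.
    return spatial_attention(channel_attention(x))

feat = np.random.default_rng(0).standard_normal((8, 4, 4))
out = combined_attention(feat)
print(out.shape)  # -> (8, 4, 4)
```

Because both gates lie in (0, 1), the module can only attenuate feature responses, re-weighting where the network attends without changing the feature map's shape.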
ISSN: 0942-4962, 1432-1882
DOI: 10.1007/s00530-025-01923-1