Semantic Segmentation Network combining Gaussian Perception and Iterative Multi-Scale Attention
Published in | Multimedia Systems, Vol. 31, No. 5 |
---|---|
Format | Journal Article |
Language | English |
Published | Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V.), 01.10.2025 |
Summary: | Semantic segmentation is crucial in autonomous driving, offering fine-grained scene understanding to tackle challenges such as small object edges and blurred textures in complex traffic environments. By performing pixel-level classification, it provides vehicles with comprehensive environmental information for safe navigation. However, when dealing with fine, fuzzy-bordered objects in complex scenes, existing techniques suffer from low segmentation accuracy because of insufficient feature extraction capability. To address this problem, this study proposes a semantic segmentation network that combines Gaussian perception with iterative multi-scale attention. The method integrates Gaussian perception with local–global channel attention to accurately model pixel associations and dynamically focus on salient features, alleviating edge blurring in complex-scene segmentation. It also employs a difference module to enhance low-frequency features and an iterative multi-scale attention mechanism to deeply fuse low-frequency and high-frequency information, which sharpens fine-feature capture and mitigates the edge discontinuities caused by masked boundary information. In addition, the method combines channel and spatial attention to optimize feature extraction, enlarge the receptive field, and improve detail, context, and boundary recognition, significantly strengthening feature expression and reducing background mis-segmentation. Experimental results show that the proposed method achieves 79.34% mIoU on the Cityscapes validation set (a 1.32% improvement over PIDNet-S) and 81.48% mIoU on the CamVid test set (a 1.05% improvement over PIDNet-S).
These results demonstrate significant improvements over existing state-of-the-art methods, confirming the effectiveness of this approach in semantic segmentation of complex urban scenes. The source code has been made publicly available on GitHub:
https://github.com/wgsheng897/GMSANet.git |
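The abstract's low-/high-frequency pipeline (a difference module isolating high-frequency detail from Gaussian-smoothed features, then attention-gated fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); the function names, the sigma/radius values, and the sigmoid gate are illustrative assumptions:

```python
import numpy as np

def gaussian_blur_1ch(img, sigma=1.0, radius=2):
    """Separable Gaussian low-pass filter for one (H, W) channel."""
    ax = np.arange(-radius, radius + 1)
    kernel = np.exp(-ax**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    # Pad with edge replication, then convolve rows and columns separately
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def frequency_split(feat, sigma=1.0):
    """Difference-module idea: low = Gaussian-smoothed features,
    high = residual detail (feat - low). feat has shape (C, H, W)."""
    low = np.stack([gaussian_blur_1ch(c, sigma) for c in feat])
    high = feat - low
    return low, high

def attention_fuse(low, high):
    """Toy attention-gated fusion: a per-pixel sigmoid gate computed from
    the low-frequency content decides how much high-frequency detail
    (edges, fine structure) to re-inject."""
    gate = 1.0 / (1.0 + np.exp(-low.mean(axis=0, keepdims=True)))  # (1, H, W)
    return low + gate * high

# Toy feature map: 4 channels, 8x8
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))
low, high = frequency_split(feat)
fused = attention_fuse(low, high)
print(fused.shape)  # (4, 8, 8)
```

By construction `low + high` reconstructs the input exactly, so the gate only redistributes how much of the high-frequency residual survives fusion; the paper's iterative multi-scale attention presumably learns this weighting rather than deriving it from a fixed sigmoid.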
---|---|
ISSN: | 0942-4962 (print); 1432-1882 (electronic) |
DOI: | 10.1007/s00530-025-01923-1 |