Lightweight Bilateral Network for Real-Time Semantic Segmentation

Herein, a dual-branch semantic segmentation model based on depth-separable convolution and attention mechanism is proposed for the real-time and accuracy requirement of semantic segmentation. The proposed approach overcomes the problems of poor segmentation effect and over-simplification of feature...

Full description

Saved in:

Bibliographic Details
Published in	Journal of advanced computational intelligence and intelligent informatics Vol. 27; no. 4; pp. 673 - 682
Main Authors	Wang, Pengtao, Li, Lihong, Pan, Feiyang, Wang, Lin
Format	Journal Article
Language	English
Published	Tokyo Fuji Technology Press Co. Ltd 01.07.2023
Subjects	Accuracy Algorithms Convolution Modules Real time Semantic segmentation Semantics Spatial data
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Herein, a dual-branch semantic segmentation model based on depth-separable convolution and attention mechanism is proposed for the real-time and accuracy requirement of semantic segmentation. The proposed approach overcomes the problems of poor segmentation effect and over-simplification of feature fusion arising from the constant downsample operations in semantic segmentation. The network is divided into spatial detail and semantic information paths. The spatial detail path utilizes a smaller downsample multiplier to maintain resolution and efficiently extract spatial information. The semantic information path is constructed by a non-bottleneck residual unit with dilated convolution; it extracts semantic features. For the feature aggregation problem, the feature-guided fusion module is designed to assign different weights to the parts of the two paths and fuse them to obtain the final output. The proposed algorithm achieves a segmentation accuracy of 69.6% and speed of 70 fps on the Cityscapes dataset, with a model parameter count of only 0.76 M, thus indicating some advantages over recent real-time semantic segmentation algorithms. The proposed method with depth separable convolution and attention mechanism can effectively extract features and compensate for the loss of accuracy caused by downsampling. The experiments demonstrate that the proposed fusion module outperforms other methods in fusing different features.
Bibliography:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ISSN:	1343-0130 1883-8014
DOI:	10.20965/jaciii.2023.p0673