LSEPNet: Joint Prediction of Disparity and Semantics Based on Binocular Vision

Bibliographic Details
Published in: 2023 2nd International Symposium on Control Engineering and Robotics (ISCER), pp. 1-7
Main Authors: Yang, Yang; Gao, Hongxia; Chen, An; Ma, Jianliang; Liang, Guoheng; Liu, Jiegeng
Format: Conference Proceeding
Language: English
Published: IEEE, 01.02.2023

Summary: Visual perception networks are often used to provide mobile robots or autonomous driving systems with necessary environmental information such as semantics and depth. However, these networks can usually complete only one task, while in practical applications a single visual perception network is often expected to complete multiple tasks at the same time. In this work, we propose LSEPNet, a lightweight visual perception network with two branches, semantic segmentation and disparity estimation, built on a binocular vision system. It is composed of a lightweight feature extraction module, an attention-based feature fusion module, a semantic decoder, and a disparity refinement module. Through feature fusion and a joint loss function, LSEPNet can jointly train the disparity estimation and semantic segmentation tasks end to end in real time. We evaluate LSEPNet on the KITTI 2015 and Cityscapes datasets and compare it with classic disparity estimation and semantic segmentation networks, respectively, showing that LSEPNet can complete both tasks together in real time with acceptable speed and accuracy.
DOI: 10.1109/ISCER58777.2023.00006
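
The summary above describes a two-branch network, a shared feature extractor feeding a semantic head and a disparity head, trained with a joint loss over both tasks. As a rough illustration only, the following PyTorch sketch shows how such a two-branch, joint-loss setup can be wired; every module size, layer choice, class/disparity count, and loss weight here is an assumption made for the example, not the authors' LSEPNet implementation (which uses attention-based fusion and a disparity refinement module not reproduced here).

```python
import torch
import torch.nn as nn

class TwoBranchPerceptionNet(nn.Module):
    """Minimal two-branch sketch: shared backbone, semantic + disparity heads."""

    def __init__(self, num_classes=19, max_disparity=192):
        super().__init__()
        # Shared lightweight feature extractor (stand-in for the paper's module).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Semantic branch: per-pixel class logits from the left image features.
        self.semantic_head = nn.Conv2d(64, num_classes, 1)
        # Disparity branch: regresses a disparity per pixel from concatenated
        # left/right features (a crude stand-in for cost-volume matching).
        self.disparity_head = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),
        )
        self.max_disparity = max_disparity

    def forward(self, left, right):
        f_left = self.backbone(left)
        f_right = self.backbone(right)
        semantics = self.semantic_head(f_left)
        # Bound the regressed disparity to [0, max_disparity].
        disparity = torch.sigmoid(
            self.disparity_head(torch.cat([f_left, f_right], dim=1))
        ) * self.max_disparity
        # Upsample both outputs back to the input resolution.
        semantics = nn.functional.interpolate(
            semantics, size=left.shape[-2:], mode="bilinear", align_corners=False
        )
        disparity = nn.functional.interpolate(
            disparity, size=left.shape[-2:], mode="bilinear", align_corners=False
        )
        return semantics, disparity


def joint_loss(semantics, disparity, sem_target, disp_target, alpha=1.0, beta=1.0):
    """Weighted sum of the two task losses; the weights alpha/beta are assumptions."""
    sem_loss = nn.functional.cross_entropy(semantics, sem_target)
    disp_loss = nn.functional.smooth_l1_loss(disparity.squeeze(1), disp_target)
    return alpha * sem_loss + beta * disp_loss


# Usage sketch:
#   net = TwoBranchPerceptionNet()
#   sem, disp = net(left_batch, right_batch)      # left/right: (B, 3, H, W)
#   loss = joint_loss(sem, disp, sem_labels, disp_gt)
#   loss.backward()
```

Because both heads share one backbone and one scalar loss, a single backward pass updates the shared features with gradients from both tasks, which is the essential mechanism behind the joint end-to-end training the abstract describes.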