LSEPNet: Joint Prediction of Disparity and Semantics Based on Binocular Vision

Bibliographic Details
Published in: 2023 2nd International Symposium on Control Engineering and Robotics (ISCER), pp. 1-7
Main Authors: Yang, Yang; Gao, Hongxia; Chen, An; Ma, Jianliang; Liang, Guoheng; Liu, Jiegeng
Format: Conference Proceeding
Language: English
Published: IEEE, 01.02.2023

Summary: Visual perception networks are often used to provide mobile robots or autonomous driving systems with necessary environmental information such as semantics and depth. However, these networks can usually complete only one task, while in practical applications a single visual perception network is often expected to complete multiple tasks at the same time. In this work, we propose LSEPNet, a lightweight visual perception network with two branches, semantic segmentation and disparity estimation, built on a binocular vision system. It is composed of a lightweight feature extraction module, an attention-based feature fusion module, a semantic decoder, and a disparity refinement module. Through feature fusion and a joint loss function, LSEPNet can jointly train the disparity estimation and semantic segmentation tasks end to end in real time. We evaluate LSEPNet on the KITTI 2015 and Cityscapes datasets and compare it with classic disparity estimation and semantic segmentation networks, respectively, showing that LSEPNet can complete both tasks together in real time with acceptable speed and accuracy.
DOI: 10.1109/ISCER58777.2023.00006
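
The summary above describes a two-branch network, a shared feature extractor feeding a semantic head and a disparity head, trained with a joint loss over both tasks. As a rough illustration only, the following PyTorch sketch shows how such a two-branch, joint-loss setup can be wired; every module size, layer choice, class/disparity count, and loss weight here is an assumption made for the example, not the authors' LSEPNet implementation (which uses attention-based fusion and a disparity refinement module not reproduced here).

```python
import torch
import torch.nn as nn

class TwoBranchPerceptionNet(nn.Module):
    """Minimal two-branch sketch: shared backbone, semantic + disparity heads."""

    def __init__(self, num_classes=19, max_disparity=192):
        super().__init__()
        # Shared lightweight feature extractor (stand-in for the paper's module).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Semantic branch: per-pixel class logits from the left image features.
        self.semantic_head = nn.Conv2d(64, num_classes, 1)
        # Disparity branch: regresses a disparity per pixel from concatenated
        # left/right features (a crude stand-in for cost-volume matching).
        self.disparity_head = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),
        )
        self.max_disparity = max_disparity

    def forward(self, left, right):
        f_left = self.backbone(left)
        f_right = self.backbone(right)
        semantics = self.semantic_head(f_left)
        # Bound the regressed disparity to [0, max_disparity].
        disparity = torch.sigmoid(
            self.disparity_head(torch.cat([f_left, f_right], dim=1))
        ) * self.max_disparity
        # Upsample both outputs back to the input resolution.
        semantics = nn.functional.interpolate(
            semantics, size=left.shape[-2:], mode="bilinear", align_corners=False
        )
        disparity = nn.functional.interpolate(
            disparity, size=left.shape[-2:], mode="bilinear", align_corners=False
        )
        return semantics, disparity


def joint_loss(semantics, disparity, sem_target, disp_target, alpha=1.0, beta=1.0):
    """Weighted sum of the two task losses; the weights alpha/beta are assumptions."""
    sem_loss = nn.functional.cross_entropy(semantics, sem_target)
    disp_loss = nn.functional.smooth_l1_loss(disparity.squeeze(1), disp_target)
    return alpha * sem_loss + beta * disp_loss


# Usage sketch:
#   net = TwoBranchPerceptionNet()
#   sem, disp = net(left_batch, right_batch)      # left/right: (B, 3, H, W)
#   loss = joint_loss(sem, disp, sem_labels, disp_gt)
#   loss.backward()
```

Because both heads share one backbone and one scalar loss, a single backward pass updates the shared features with gradients from both tasks, which is the essential mechanism behind the joint end-to-end training the abstract describes.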