An FPGA-based real-time occlusion robust stereo vision system using semi-global matching

Bibliographic Details
Published in: Journal of Real-Time Image Processing, Vol. 17, no. 5, pp. 1447-1468
Main Authors: Cambuim, Lucas F. S.; Oliveira, Luiz A.; Barros, Edna N. S.; Ferreira, Antonyus P. A.
Format: Journal Article
Language: English
Published: Berlin/Heidelberg, Springer Berlin Heidelberg, 01.10.2020
Springer Nature B.V.
Summary: Stereo matching is an appealing approach for acquiring depth information in many video processing applications. Ideally, such solutions generate dense, robust disparity maps in real time. However, occluded regions can degrade the applications that rely on these maps. Among the best stereo matching approaches is the semi-global matching (SGM) technique. This paper presents an FPGA-based stereo vision system built on SGM. The system computes disparity maps in a streaming fashion and scales to several resolutions and disparity ranges. To further increase the robustness of SGM, the pre-processing phase combines a gradient filter with the sampling-insensitive absolute difference. Furthermore, as a post-processing step, the paper proposes a novel streaming architecture to detect noisy and occluded regions. FPGA-based implementations of the proposed stereo matching system on two distinct heterogeneous architectures (GPP, i.e. general-purpose processor, and FPGA) were evaluated using the Middlebury stereo vision benchmark. The system achieved a frame rate of 25 FPS when processing disparity maps at HD resolution (1024 × 768 pixels) with 256 disparity levels. The results demonstrate that its memory utilization, processing performance, and accuracy are among the best of FPGA-based stereo vision systems.
ISSN: 1861-8200, 1861-8219
DOI: 10.1007/s11554-019-00902-w