J-MOD2: Joint Monocular Obstacle Detection and Depth Estimation

Bibliographic Details
Published in: IEEE Robotics and Automation Letters, Vol. 3, No. 3, pp. 1490-1497
Main Authors: Mancini, Michele; Costante, Gabriele; Valigi, Paolo; Ciarfuglia, Thomas A.
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2018
Summary: In this letter, we propose an end-to-end deep architecture that jointly learns to detect obstacles and estimate their depth for MAV flight applications. Most existing approaches rely either on visual simultaneous localization and mapping (SLAM) systems or on depth estimation models to build three-dimensional maps and detect obstacles. However, for the task of avoiding obstacles this level of complexity is not required. Recent works have proposed multitask architectures to perform both scene understanding and depth estimation. We follow their path and propose a specific architecture to jointly estimate depth and obstacles, without the need to compute a global map, while maintaining compatibility with a global SLAM system if needed. The network architecture is devised to jointly exploit the information learned from the obstacle detection task, which produces reliable bounding boxes, and from the depth estimation task, increasing the robustness of both to scenario changes. We call this architecture J-MOD2. We test the effectiveness of our approach with experiments on sequences with different appearances and focal lengths, and compare it to state-of-the-art multitask methods that perform both semantic segmentation and depth estimation. In addition, we show the integration in a full system using a set of simulated navigation experiments, in which a micro aerial vehicle explores an unknown scenario and plans safe trajectories by using our detection model.
ISSN: 2377-3766
DOI: 10.1109/LRA.2018.2800083