J-MOD2: Joint Monocular Obstacle Detection and Depth Estimation

Bibliographic Details
Published in: IEEE Robotics and Automation Letters, Vol. 3, No. 3, pp. 1490-1497
Main Authors: Mancini, Michele; Costante, Gabriele; Valigi, Paolo; Ciarfuglia, Thomas A.
Format: Journal Article
Language: English
Published: Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.07.2018
Subjects: Barriers; Cameras; Computer architecture; Computer simulation; Estimation; Feature extraction; Micro air vehicles (MAV); Obstacle avoidance; Range sensing; Robustness; Scene analysis; Semantic segmentation; Simultaneous localization and mapping; Task analysis; Three-dimensional models; Three-dimensional displays; Visual flight; visual learning; visual-based navigation
ISSN: 2377-3766
DOI: 10.1109/LRA.2018.2800083

Abstract: In this letter, we propose an end-to-end deep architecture that jointly learns to detect obstacles and estimate their depth for MAV flight applications. Most of the existing approaches rely either on visual simultaneous localization and mapping (SLAM) systems or on depth estimation models to build three-dimensional maps and detect obstacles. However, for the task of avoiding obstacles this level of complexity is not required. Recent works have proposed multitask architectures to perform both scene understanding and depth estimation. We follow their path and propose a specific architecture to jointly estimate depth and detect obstacles, without the need to compute a global map, but maintaining compatibility with a global SLAM system if needed. The network architecture is devised to jointly exploit the information learned from the obstacle detection task, which produces reliable bounding boxes, and the depth estimation one, increasing the robustness of both to scenario changes. We call this architecture J-MOD2. We test the effectiveness of our approach with experiments on sequences with different appearance and focal lengths and compare it to state-of-the-art multitask methods that perform both semantic segmentation and depth estimation. In addition, we show the integration in a full system using a set of simulated navigation experiments, where a micro aerial vehicle explores an unknown scenario and plans safe trajectories by using our detection model.
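The abstract describes a multitask network in which a shared feature extractor feeds two output branches: one regressing a dense depth map and one predicting obstacle bounding boxes. The following is a minimal, hypothetical PyTorch sketch of that general shared-encoder, two-head pattern; the backbone, layer sizes, detection grid, and output parameterization are illustrative assumptions, not the exact J-MOD2 design from the paper.

# Illustrative sketch of a joint obstacle-detection + depth-estimation network.
# The encoder, head sizes, and detection grid below are assumptions made for
# illustration; they are NOT the exact J-MOD2 architecture.
import torch
import torch.nn as nn

class JointObstacleDepthNet(nn.Module):
    def __init__(self, grid_h=8, grid_w=10, box_params=5):
        super().__init__()
        # Shared convolutional encoder (assumed small generic backbone).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Depth head: upsamples back to input resolution, one value per pixel.
        self.depth_head = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )
        # Detection head: coarse grid where each cell predicts one candidate
        # obstacle box as (confidence, x, y, w, h).
        self.det_head = nn.Sequential(
            nn.AdaptiveAvgPool2d((grid_h, grid_w)),
            nn.Conv2d(128, box_params, 1),
        )

    def forward(self, x):
        feats = self.encoder(x)          # shared features feed both tasks
        depth = self.depth_head(feats)   # dense depth map, (B, 1, H, W)
        boxes = self.det_head(feats)     # obstacle grid, (B, 5, grid_h, grid_w)
        return depth, boxes

# Smoke test on a 256x320 RGB image (arbitrary example size).
net = JointObstacleDepthNet()
depth, boxes = net(torch.randn(1, 3, 256, 320))
print(depth.shape, boxes.shape)  # (1, 1, 256, 320) and (1, 5, 8, 10)

Sharing the encoder is what couples the two tasks: gradients from the detection loss and the depth loss both update the same features, which is the mechanism the abstract credits for the increased robustness of both outputs to scenario changes.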
Author Details:
– Mancini, Michele; ORCID: 0000-0003-0845-2813; email: michele.mancini@unipg.it; Department of Engineering, University of Perugia, Perugia, Italy
– Costante, Gabriele; ORCID: 0000-0002-8417-9372; email: gabriele.costante@unipg.it; Department of Engineering, University of Perugia, Perugia, Italy
– Valigi, Paolo; ORCID: 0000-0002-0486-7678; email: paolo.valigi@unipg.it; Department of Engineering, University of Perugia, Perugia, Italy
– Ciarfuglia, Thomas A.; ORCID: 0000-0001-8646-8197; email: thomas.ciarfuglia@unipg.it; Department of Engineering, University of Perugia, Perugia, Italy
CODEN: IRALC6
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2018
Indexed in: IEEE All-Society Periodicals Package (ASPP) 2005–Present; IEEE All-Society Periodicals Package (ASPP) 1998–Present; IEEE Electronic Library (IEL); Computer and Information Systems Abstracts; Electronics & Communications Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional
Discipline: Engineering
EISSN: 2377-3766
Genre: Original research
Grant Information: NVIDIA (funder ID 10.13039/100007065)
Peer Reviewed: Yes
Scholarly: Yes
Page Count: 8
Publication Title Abbreviation: LRA
URI: https://ieeexplore.ieee.org/document/8276580
  https://www.proquest.com/docview/2299371111