Toward Domain Independence for Learning-Based Monocular Depth Estimation

Bibliographic Details
Published in: IEEE Robotics and Automation Letters, Vol. 2, no. 3, pp. 1778–1785
Main Authors: Mancini, Michele; Costante, Gabriele; Valigi, Paolo; Ciarfuglia, Thomas A.; Delmerico, Jeffrey; Scaramuzza, Davide
Format: Journal Article
Language: English
Published: IEEE, 01.07.2017
Subjects

Abstract Modern autonomous mobile robots require a strong understanding of their surroundings in order to safely operate in cluttered and dynamic environments. Monocular depth estimation offers a geometry-independent paradigm to detect free, navigable space with minimum space and power consumption. These represent highly desirable features, especially for micro aerial vehicles. In order to guarantee robust operation in real-world scenarios, the estimator is required to generalize well in diverse environments. Most existing depth estimators do not consider generalization, and only benchmark their performance on publicly available datasets after specific fine-tuning. Generalization can be achieved by training on several heterogeneous datasets, but their collection and labeling is costly. In this letter, we propose a deep neural network for scene depth estimation that is trained on synthetic datasets, which allow inexpensive generation of ground-truth data. We show how this approach is able to generalize well across different scenarios. In addition, we show how the addition of long short-term memory (LSTM) layers in the network helps to alleviate, in sequential image streams, some of the intrinsic limitations of monocular vision, such as global scale estimation, with low computational overhead. We demonstrate that the network generalizes well to different real-world environments without any fine-tuning, achieving performance comparable to state-of-the-art methods on the KITTI dataset.
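The letter's key architectural idea, feeding per-frame CNN features through long short-term memory (LSTM) layers so that temporal context across an image stream can help resolve global scale, can be sketched as follows. This is a minimal, hypothetical illustration (random weights, made-up feature sizes, a generic LSTM step after Hochreiter and Schmidhuber, 1997), not the authors' actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Generic single-layer LSTM step over feature vectors (illustrative only)."""

    def __init__(self, input_size, hidden_size):
        scale = 1.0 / np.sqrt(hidden_size)
        # One stacked weight matrix covering the input, forget, cell, and output gates.
        self.W = rng.uniform(-scale, scale,
                             (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # updated cell state
        h = sigmoid(o) * np.tanh(c)                   # new hidden state
        return h, c

# A 5-frame monocular stream, stood in for by hypothetical per-frame CNN features.
features = [rng.standard_normal(64) for _ in range(5)]

cell = LSTMCell(input_size=64, hidden_size=32)
h = np.zeros(32)
c = np.zeros(32)
for x in features:
    # The recurrent state carried across frames is what lets sequential
    # observations inform per-frame predictions (e.g., global scale).
    h, c = cell.step(x, h, c)

print(h.shape)  # (32,)
```

In the letter's setting, a state like `h` would condition the depth decoder; here it simply demonstrates how a constant-size recurrent state accumulates information over a frame sequence at low computational cost.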
Author: Costante, Gabriele; Mancini, Michele; Ciarfuglia, Thomas A.; Valigi, Paolo; Delmerico, Jeffrey; Scaramuzza, Davide
Author_xml – sequence: 1
  givenname: Michele
  surname: Mancini
  fullname: Mancini, Michele
  email: michele.mancini@unipg.it
  organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy
– sequence: 2
  givenname: Gabriele
  surname: Costante
  fullname: Costante, Gabriele
  email: gabriele.costante@unipg.it
  organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy
– sequence: 3
  givenname: Paolo
  surname: Valigi
  fullname: Valigi, Paolo
  email: paolo.valigi@unipg.it
  organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy
– sequence: 4
  givenname: Thomas A.
  surname: Ciarfuglia
  fullname: Ciarfuglia, Thomas A.
  email: thomas.ciarfuglia@unipg.it
  organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy
– sequence: 5
  givenname: Jeffrey
  surname: Delmerico
  fullname: Delmerico, Jeffrey
  email: jeffdelmerico@ifi.uzh.ch
  organization: Robot. & Perception Group, Univ. of Zurich, Zurich, Switzerland
– sequence: 6
  givenname: Davide
  surname: Scaramuzza
  fullname: Scaramuzza, Davide
  email: sdavide@ifi.uzh.ch
  organization: Robot. & Perception Group, Univ. of Zurich, Zurich, Switzerland
CODEN IRALC6
ContentType Journal Article
DBID 97E
RIA
RIE
AAYXX
CITATION
DOI 10.1109/LRA.2017.2657002
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2377-3766
EndPage 1785
ExternalDocumentID 10_1109_LRA_2017_2657002
7829276
Genre orig-research
GrantInformation_xml – fundername: M.I.U.R. (Ministero dell’Istruzione dell’Università e della Ricerca)
  grantid: SCN_398/SEAL
– fundername: DARPA FLA Program
IEDL.DBID RIE
ISSN 2377-3766
IngestDate Tue Jul 01 03:53:52 EDT 2025
Thu Apr 24 22:53:59 EDT 2025
Tue Aug 26 16:57:02 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
LinkModel DirectLink
OpenAccessLink https://www.zora.uzh.ch/id/eprint/138918/1/RAL17_Mancini.pdf
PageCount 8
PublicationCentury 2000
PublicationDate 2017-July
2017-7-00
PublicationDateYYYYMMDD 2017-07-01
PublicationDate_xml – month: 07
  year: 2017
  text: 2017-July
PublicationDecade 2010
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2017
Publisher IEEE
Publisher_xml – name: IEEE
References ref13
ref12
ref14
ref31
deng (ref22) 0
ref10
simonyan (ref21) 2014
ref1
ref16
ref18
pinggera (ref3) 2014
eigen (ref7) 0
richter (ref2) 2016
fischer (ref19) 2015
badrinarayanan (ref20) 2015
ref24
davies (ref4) 2004
ref26
silberman (ref30) 2012
engel (ref15) 2014
ref28
ref27
chatfield (ref23) 2014
sharma (ref25) 2015
ref8
(ref29) 0
saxena (ref17) 0
ref9
garg (ref11) 2016
ref6
ref5
References_xml – start-page: 834
  year: 2014
  ident: ref15
  article-title: LSD-SLAM: Large-scale direct monocular SLAM
  publication-title: European Conference on Computer Vision
– year: 2014
  ident: ref23
  article-title: Return of the devil in the details: Delving deep into convolutional nets
  publication-title: arXiv preprint arXiv:1405.3531
– ident: ref5
  doi: 10.1109/IROS.2015.7353537
– start-page: 649
  year: 2016
  ident: ref2
  article-title: Polynomial trajectory planning for aggressive quadrotor flight in dense indoor environments
  publication-title: Robotics Research
  doi: 10.1007/978-3-319-28872-7_37
– ident: ref13
  doi: 10.1023/A:1014573219977
– year: 2014
  ident: ref21
  article-title: Very deep convolutional networks for large-scale image recognition
  publication-title: arXiv preprint arXiv:1409.1556
– year: 2015
  ident: ref20
  article-title: SegNet: A deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling
  publication-title: arXiv preprint arXiv:1505.03561
– ident: ref1
  doi: 10.1109/ICRA.2015.7138979
– year: 2004
  ident: ref4
  publication-title: Machine Vision Theory Algorithms Practicalities
– ident: ref18
  doi: 10.1109/ICCV.2015.304
– start-page: 746
  year: 2012
  ident: ref30
  article-title: Indoor segmentation and support inference from RGBD images
  publication-title: Computer Vision
– year: 2015
  ident: ref19
  article-title: FlowNet: Learning optical flow with convolutional networks
– ident: ref26
  doi: 10.1109/CVPR.2016.148
– ident: ref31
  doi: 10.1109/TPAMI.2007.1166
– ident: ref6
  doi: 10.1109/IROS.2015.7353448
– ident: ref9
  doi: 10.1109/CVPR.2016.594
– ident: ref12
  doi: 10.1109/IROS.2016.7759632
– ident: ref24
  doi: 10.1109/ICASSP.2015.7178838
– start-page: 1161
  year: 0
  ident: ref17
  article-title: Learning depth from single monocular images
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref27
  doi: 10.1162/neco.1997.9.8.1735
– year: 2016
  ident: ref11
  article-title: Unsupervised CNN for single view depth estimation: Geometry to the rescue
  doi: 10.1007/978-3-319-46484-8_45
– ident: ref16
  doi: 10.1109/ICCV.2011.6126513
– ident: ref14
  doi: 10.1109/ICRA.2014.6907233
– start-page: 2366
  year: 0
  ident: ref7
  article-title: Depth map prediction from a single image using a multi-scale deep network
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref28
  doi: 10.1177/0278364913491297
– start-page: 96
  year: 2014
  ident: ref3
  article-title: Know your limits: Accuracy of long range stereoscopic object measurements in practice
  publication-title: Computer Vision
– year: 0
  ident: ref29
– ident: ref10
  doi: 10.1109/TPAMI.2008.132
– ident: ref8
  doi: 10.1109/TPAMI.2015.2505283
– year: 2015
  ident: ref25
  article-title: Action recognition using visual attention
  publication-title: arXiv preprint arXiv:1511.05271
– start-page: 248
  year: 0
  ident: ref22
  article-title: ImageNet: A large-scale hierarchical image database
  publication-title: Proc IEEE Conf Comput Vis Pattern Recog
SSID ssj0001527395
Score 2.318378
SourceID crossref
ieee
SourceType Enrichment Source
Index Database
Publisher
StartPage 1778
SubjectTerms Benchmark testing
Cameras
Collision avoidance
Estimation
Feature extraction
range sensing
Streaming media
Training
Vehicles
visual-based navigation
Title Toward Domain Independence for Learning-Based Monocular Depth Estimation
URI https://ieeexplore.ieee.org/document/7829276
Volume 2
hasFullText 1
inHoldings 1
isFullTextHit
isPrint