Toward Domain Independence for Learning-Based Monocular Depth Estimation
Published in | IEEE Robotics and Automation Letters Vol. 2; no. 3; pp. 1778-1785 |
---|---|
Main Authors | Mancini, Michele; Costante, Gabriele; Valigi, Paolo; Ciarfuglia, Thomas A.; Delmerico, Jeffrey; Scaramuzza, Davide |
Format | Journal Article |
Language | English |
Published | IEEE, 01.07.2017 |
Subjects | Benchmark testing; Cameras; Collision avoidance; Estimation; Feature extraction; range sensing; Streaming media; Training; Vehicles; visual-based navigation |
Online Access | Get full text |
Abstract | Modern autonomous mobile robots require a strong understanding of their surroundings in order to operate safely in cluttered and dynamic environments. Monocular depth estimation offers a geometry-independent paradigm for detecting free, navigable space with minimal space and power consumption. These are highly desirable features, especially for micro aerial vehicles. To guarantee robust operation in real-world scenarios, the estimator must generalize well across diverse environments. Most existing depth estimators do not consider generalization and only benchmark their performance on publicly available datasets after dataset-specific fine-tuning. Generalization can be achieved by training on several heterogeneous datasets, but collecting and labeling them is costly. In this letter, we propose a deep neural network for scene depth estimation that is trained on synthetic datasets, which allow inexpensive generation of ground-truth data. We show how this approach generalizes well across different scenarios. In addition, we show how adding long short-term memory layers to the network helps to alleviate, in sequential image streams, some of the intrinsic limitations of monocular vision, such as global scale estimation, with low computational overhead. We demonstrate that the network generalizes well to different real-world environments without any fine-tuning, achieving performance comparable to state-of-the-art methods on the KITTI dataset. |
---|---|
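The abstract reports performance "comparable to state-of-the-art methods on the KITTI dataset". Evaluations of this kind are conventionally scored with a small set of per-pixel depth error metrics: absolute relative error, RMSE, and the fraction of pixels whose predicted/ground-truth ratio stays below 1.25. The sketch below shows these conventional metric definitions in plain Python; it is illustrative only and is not code from the letter.

```python
import math

def depth_metrics(pred, gt):
    """Common monocular-depth error metrics (as typically used on KITTI).

    pred, gt: flat sequences of per-pixel depths in metres (gt > 0).
    Returns absolute relative error, RMSE, and the delta < 1.25
    accuracy (fraction of pixels whose ratio to ground truth, taken
    in whichever direction exceeds 1, is below 1.25).
    """
    n = len(gt)
    abs_rel = sum(abs(p - g) / g for p, g in zip(pred, gt)) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / n)
    delta1 = sum(max(p / g, g / p) < 1.25 for p, g in zip(pred, gt)) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}

# Example with hypothetical values: predictions within ~10% of ground truth.
gt = [5.0, 10.0, 20.0, 40.0]
pred = [5.5, 9.0, 22.0, 38.0]
m = depth_metrics(pred, gt)
```

Note that abs_rel and delta1 are scale-sensitive: a prediction that is correct up to a global scale factor still scores poorly, which is why recovering global scale (here, via LSTM layers over sequential frames) matters for monocular methods.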
Author | Costante, Gabriele Mancini, Michele Ciarfuglia, Thomas A. Valigi, Paolo Delmerico, Jeffrey Scaramuzza, Davide |
Author_xml | – sequence: 1 givenname: Michele surname: Mancini fullname: Mancini, Michele email: michele.mancini@unipg.it organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy – sequence: 2 givenname: Gabriele surname: Costante fullname: Costante, Gabriele email: gabriele.costante@unipg.it organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy – sequence: 3 givenname: Paolo surname: Valigi fullname: Valigi, Paolo email: paolo.valigi@unipg.it organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy – sequence: 4 givenname: Thomas A. surname: Ciarfuglia fullname: Ciarfuglia, Thomas A. email: thomas.ciarfuglia@unipg.it organization: Dept. of Eng., Univ. of Perugia, Perugia, Italy – sequence: 5 givenname: Jeffrey surname: Delmerico fullname: Delmerico, Jeffrey email: jeffdelmerico@ifi.uzh.ch organization: Robot. & Perception Group, Univ. of Zurich, Zurich, Switzerland – sequence: 6 givenname: Davide surname: Scaramuzza fullname: Scaramuzza, Davide email: sdavide@ifi.uzh.ch organization: Robot. & Perception Group, Univ. of Zurich, Zurich, Switzerland |
CODEN | IRALC6 |
ContentType | Journal Article |
DOI | 10.1109/LRA.2017.2657002 |
Discipline | Engineering |
EISSN | 2377-3766 |
EndPage | 1785 |
ExternalDocumentID | 10_1109_LRA_2017_2657002 7829276 |
Genre | orig-research |
GrantInformation_xml | – fundername: M.I.U.R. (Ministero dell'Istruzione, dell'Università e della Ricerca) grantid: SCN_398/SEAL – fundername: DARPA FLA Program |
ISSN | 2377-3766 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |
OpenAccessLink | https://www.zora.uzh.ch/id/eprint/138918/1/RAL17_Mancini.pdf |
PageCount | 8 |
ParticipantIDs | crossref_citationtrail_10_1109_LRA_2017_2657002 crossref_primary_10_1109_LRA_2017_2657002 ieee_primary_7829276 |
PublicationDate | 2017-07-01 (July 2017) |
PublicationTitle | IEEE robotics and automation letters |
PublicationTitleAbbrev | LRA |
PublicationYear | 2017 |
Publisher | IEEE |
StartPage | 1778 |
SubjectTerms | Benchmark testing; Cameras; Collision avoidance; Estimation; Feature extraction; range sensing; Streaming media; Training; Vehicles; visual-based navigation |
Title | Toward Domain Independence for Learning-Based Monocular Depth Estimation |
URI | https://ieeexplore.ieee.org/document/7829276 |
Volume | 2 |