Encoder-Decoder Structure With the Feature Pyramid for Depth Estimation From a Single Image
Published in | IEEE Access, Vol. 9, pp. 22640-22650 |
---|---|
Main Authors | Tang, Mengxia; Chen, Songnan; Dong, Ruifang; Kan, Jiangming |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE, 2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Abstract | This paper addresses depth estimation from a single monocular image, an ill-posed and inherently ambiguous problem. We propose an encoder-decoder structure with a feature pyramid to predict the depth map from a single RGB image. The feature pyramid is used to detect objects at different scales in the image. The encoder extracts the most representative information from the original image through a series of convolution operations while reducing the resolution of the input; we adopt Res2-50 as the encoder. The decoder uses a novel upsampling structure to improve the output resolution. We also propose a novel loss function that adds a gradient loss and a surface normal loss to the depth loss, so that the network predicts not only the global depth but also the depth of fuzzy edges and small objects. We use Adam as the optimizer to train the network and speed up convergence. Extensive experiments show that the method is efficient and effective: it is competitive with previous methods on the Make3D dataset and outperforms state-of-the-art methods on the NYU Depth v2 dataset. |
---|---|
Author | Chen, Songnan; Kan, Jiangming; Dong, Ruifang; Tang, Mengxia |
Author_xml | – sequence: 1 givenname: Mengxia orcidid: 0000-0003-4504-7163 surname: Tang fullname: Tang, Mengxia organization: School of Technology, Beijing Forestry University, Beijing, China – sequence: 2 givenname: Songnan orcidid: 0000-0003-0314-1194 surname: Chen fullname: Chen, Songnan organization: School of Technology, Beijing Forestry University, Beijing, China – sequence: 3 givenname: Ruifang orcidid: 0000-0001-7247-4131 surname: Dong fullname: Dong, Ruifang organization: School of Technology, Beijing Forestry University, Beijing, China – sequence: 4 givenname: Jiangming orcidid: 0000-0002-7326-7078 surname: Kan fullname: Kan, Jiangming email: kanjm@bjfu.edu.cn organization: School of Technology, Beijing Forestry University, Beijing, China |
CODEN | IAECCG |
CitedBy_id | crossref_primary_10_1109_JSEN_2024_3370821 crossref_primary_10_1007_s00521_022_07663_x crossref_primary_10_1007_s13198_024_02431_7 crossref_primary_10_1109_TCI_2023_3288335 crossref_primary_10_3390_app13179924 crossref_primary_10_3390_s23062919 |
Cites_doi | 10.1109/CVPR.2014.97 10.1117/12.526634 10.1109/CVPR.2014.19 10.1109/CVPR.2017.106 10.1109/CVPR.2010.5539823 10.1109/CVPR.2016.642 10.1109/TPAMI.2017.2699184 10.1109/CVPR.1997.609323 10.1364/OE.26.008179 10.1109/TPAMI.2010.161 10.1109/ICCV.2017.365 10.1109/CVPR.2018.00043 10.1109/TNNLS.2018.2876865 10.1109/CVPR.2016.594 10.1109/CVPR.2018.00042 10.1109/TPAMI.2008.132 10.1109/3DV.2018.00073 10.1109/TPAMI.2019.2938758 10.1109/CVPR.2017.25 10.1109/ICCV.2015.304 10.1109/CVPR.2006.23 10.1109/3DV.2016.32 10.1109/ICRA.2017.7989632 10.1109/CVPR42600.2020.00481 10.1109/CVPR.2015.7298972 10.1007/s11431-016-9017-6 10.1007/s11431-015-5828-x 10.1109/CVPR42600.2020.00073 10.1109/34.784284 10.1109/CVPR.2015.7298965 10.3390/s19030667 10.1109/CVPR42600.2020.00329 10.1109/CVPR.2019.00363 10.1109/CVPR42600.2020.00256 10.1109/ICCV.2017.533 10.1016/j.mechatronics.2017.12.009 10.1109/CVPR.2018.00037 10.1109/CVPR.2015.7299152 10.1109/CVPR.2017.238 10.1145/1186822.1073232 10.1109/CVPR.2017.699 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021 |
DBID | 97E ESBDL RIA RIE AAYXX CITATION 7SC 7SP 7SR 8BQ 8FD JG9 JQ2 L7M L~C L~D DOA |
DOI | 10.1109/ACCESS.2021.3055497 |
DatabaseName | IEEE Xplore (IEEE) IEEE Xplore Open Access Journals IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Engineered Materials Abstracts METADEX Technology Research Database Materials Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional DOAJ Directory of Open Access Journals |
DatabaseTitle | CrossRef Materials Research Database Engineered Materials Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace METADEX Computer and Information Systems Abstracts Professional |
DatabaseTitleList | Materials Research Database |
Database_xml | – sequence: 1 dbid: DOA name: DOAJ Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website – sequence: 2 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 22650 |
ExternalDocumentID | oai_doaj_org_article_1e8106baf5c946328a4fb1f0d3794bd4 10_1109_ACCESS_2021_3055497 9340320 |
Genre | orig-research |
GrantInformation_xml | – fundername: Science and Technology Department of Henan Province grantid: 182102110160 funderid: 10.13039/501100011447 – fundername: National Natural Science Foundation of China grantid: 32071680 funderid: 10.13039/501100001809 |
GroupedDBID | 0R~ 4.4 5VS 6IK 97E AAJGR ABAZT ABVLG ACGFS ADBBV AGSQL ALMA_UNASSIGNED_HOLDINGS BCNDV BEFXN BFFAM BGNUA BKEBE BPEOZ EBS EJD ESBDL GROUPED_DOAJ IPLJI JAVBF KQ8 M43 M~E O9- OCL OK1 RIA RIE RNS AAYXX CITATION RIG 7SC 7SP 7SR 8BQ 8FD JG9 JQ2 L7M L~C L~D |
ID | FETCH-LOGICAL-c408t-25554f213ae4e81c54d8b0fb1799e25d81a2dfa849f6d8cd59ea71c9e73f7b2e3 |
IEDL.DBID | DOA |
ISSN | 2169-3536 |
IngestDate | Wed Aug 27 01:25:58 EDT 2025 Mon Jun 30 03:30:03 EDT 2025 Thu Apr 24 22:55:49 EDT 2025 Tue Jul 01 04:03:12 EDT 2025 Wed Aug 27 05:45:08 EDT 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | https://creativecommons.org/licenses/by/4.0/legalcode |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c408t-25554f213ae4e81c54d8b0fb1799e25d81a2dfa849f6d8cd59ea71c9e73f7b2e3 |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ORCID | 0000-0001-7247-4131 0000-0003-0314-1194 0000-0002-7326-7078 0000-0003-4504-7163 |
OpenAccessLink | https://doaj.org/article/1e8106baf5c946328a4fb1f0d3794bd4 |
PQID | 2488745575 |
PQPubID | 4845423 |
PageCount | 11 |
ParticipantIDs | proquest_journals_2488745575 doaj_primary_oai_doaj_org_article_1e8106baf5c946328a4fb1f0d3794bd4 crossref_primary_10_1109_ACCESS_2021_3055497 ieee_primary_9340320 crossref_citationtrail_10_1109_ACCESS_2021_3055497 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 20210000 2021-00-00 20210101 2021-01-01 |
PublicationDateYYYYMMDD | 2021-01-01 |
PublicationDate_xml | – year: 2021 text: 20210000 |
PublicationDecade | 2020 |
PublicationPlace | Piscataway |
PublicationPlace_xml | – name: Piscataway |
PublicationTitle | IEEE access |
PublicationTitleAbbrev | Access |
PublicationYear | 2021 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SSID | ssj0000816957 |
Score | 2.2339437 |
SourceID | doaj proquest crossref ieee |
SourceType | Open Website; Aggregation Database; Enrichment Source; Index Database; Publisher |
StartPage | 22640 |
SubjectTerms | Coders; Convolution; Datasets; Decoding; Depth prediction; encoder-decoder; Encoders-Decoders; Estimation; Feature extraction; feature pyramid; Object recognition; Optimization; Periodic structures; single image; Task analysis; Three-dimensional displays |
URI | https://ieeexplore.ieee.org/document/9340320 https://www.proquest.com/docview/2488745575 https://doaj.org/article/1e8106baf5c946328a4fb1f0d3794bd4 |
Volume | 9 |
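
The abstract describes a loss that adds a gradient loss and a surface normal loss to the depth loss. The following is a minimal sketch of such a combined loss in plain Python on nested lists, offered only as an illustration. The L1 depth term, the forward-difference gradients, the cosine-based normal term, the function names, and the weights `w_grad` and `w_normal` are all assumptions made here; the record does not state the authors' exact formulation.

```python
import math


def forward_differences(depth):
    """Return (dx, dy) forward differences on the (H-1) x (W-1) interior grid.

    Assumes `depth` is a rectangular list of lists with at least 2 rows and 2 columns.
    """
    h, w = len(depth), len(depth[0])
    dx = [[depth[i][j + 1] - depth[i][j] for j in range(w - 1)] for i in range(h - 1)]
    dy = [[depth[i + 1][j] - depth[i][j] for j in range(w - 1)] for i in range(h - 1)]
    return dx, dy


def combined_loss(pred, target, w_grad=1.0, w_normal=1.0):
    """Illustrative depth + gradient + surface-normal loss (weights are hypothetical)."""
    h, w = len(target), len(target[0])

    # Depth term: mean absolute difference over all pixels.
    l_depth = sum(
        abs(pred[i][j] - target[i][j]) for i in range(h) for j in range(w)
    ) / (h * w)

    pdx, pdy = forward_differences(pred)
    tdx, tdy = forward_differences(target)

    l_grad = 0.0
    l_normal = 0.0
    for i in range(h - 1):
        for j in range(w - 1):
            # Gradient term: penalizes mismatched depth edges.
            l_grad += abs(pdx[i][j] - tdx[i][j]) + abs(pdy[i][j] - tdy[i][j])

            # Normal term: 1 - cosine similarity between the unnormalized
            # surface normals (-dx, -dy, 1) of prediction and target.
            dot = pdx[i][j] * tdx[i][j] + pdy[i][j] * tdy[i][j] + 1.0
            norms = math.sqrt(pdx[i][j] ** 2 + pdy[i][j] ** 2 + 1.0) * math.sqrt(
                tdx[i][j] ** 2 + tdy[i][j] ** 2 + 1.0
            )
            l_normal += 1.0 - dot / norms

    n = (h - 1) * (w - 1)
    return l_depth + w_grad * l_grad / n + w_normal * l_normal / n
```

In this sketch, a prediction that differs from the target only by a constant offset is penalized by the depth term alone, because its gradients and normals match the target. The gradient and normal terms respond only to differences in local shape, which is consistent with the abstract's emphasis on edges and small objects.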