Encoder-Decoder Structure With the Feature Pyramid for Depth Estimation From a Single Image

Bibliographic Details
Published in IEEE Access, Vol. 9, pp. 22640-22650
Main Authors Tang, Mengxia; Chen, Songnan; Dong, Ruifang; Kan, Jiangming
Format Journal Article
Language English
Published Piscataway IEEE 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Abstract We address the problem of depth estimation from a single monocular image, which is ill-posed and inherently ambiguous. We propose an encoder-decoder structure with a feature pyramid to predict the depth map from a single RGB image. The feature pyramid is used to detect objects at different scales in the image. The encoder extracts the most representative information from the original image through a series of convolution operations while reducing the resolution of the input; we adopt Res2-50 as the encoder. The decoder uses a novel upsampling structure to improve the output resolution. We also propose a novel loss function that adds a gradient loss and a surface normal loss to the depth loss, so that the network can predict not only the global depth but also the depth of fuzzy edges and small objects. We use Adam as the optimizer to train the network and speed up convergence. Extensive experimental evaluation demonstrates the efficiency and effectiveness of the method, which is competitive with previous methods on the Make3D dataset and outperforms state-of-the-art methods on the NYU Depth v2 dataset.
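
The loss described in the abstract combines a depth term with a gradient term and a surface normal term. The record does not give the exact formulation, so the following is only a minimal NumPy sketch under stated assumptions: an L1 depth term, forward-difference gradients, normals built as (-dx, -dy, 1), a cosine-based normal term, and hypothetical weights `w_grad` and `w_normal`. It is not the authors' implementation.

```python
import numpy as np


def image_gradients(depth):
    """Forward-difference gradients (dx, dy), cropped to a common shape."""
    dx = depth[:-1, 1:] - depth[:-1, :-1]   # horizontal change
    dy = depth[1:, :-1] - depth[:-1, :-1]   # vertical change
    return dx, dy


def surface_normals(depth):
    """Unit normals from depth gradients, assuming n is proportional to (-dx, -dy, 1)."""
    dx, dy = image_gradients(depth)
    n = np.stack([-dx, -dy, np.ones_like(dx)], axis=-1)
    # The z component is 1, so the norm is never zero.
    return n / np.linalg.norm(n, axis=-1, keepdims=True)


def combined_loss(pred, target, w_grad=1.0, w_normal=1.0):
    """Depth + gradient + surface normal loss (weights are hypothetical)."""
    # Assumed depth term: mean absolute error over all pixels.
    l_depth = np.mean(np.abs(pred - target))

    # Gradient term: penalizes differences in local depth changes (edges).
    pdx, pdy = image_gradients(pred)
    tdx, tdy = image_gradients(target)
    l_grad = np.mean(np.abs(pdx - tdx) + np.abs(pdy - tdy))

    # Normal term: 1 - cosine similarity between predicted and target normals.
    cos = np.sum(surface_normals(pred) * surface_normals(target), axis=-1)
    l_normal = np.mean(1.0 - cos)

    return l_depth + w_grad * l_grad + w_normal * l_normal
```

In this sketch, a prediction that differs from the target by a constant offset is penalized only by the depth term, because its gradients and normals match the target exactly; the gradient and normal terms respond to errors in local structure such as edges.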
Authors
Tang, Mengxia (ORCID 0000-0003-4504-7163), School of Technology, Beijing Forestry University, Beijing, China
Chen, Songnan (ORCID 0000-0003-0314-1194), School of Technology, Beijing Forestry University, Beijing, China
Dong, Ruifang (ORCID 0000-0001-7247-4131), School of Technology, Beijing Forestry University, Beijing, China
Kan, Jiangming (ORCID 0000-0002-7326-7078; email kanjm@bjfu.edu.cn), School of Technology, Beijing Forestry University, Beijing, China
CODEN IAECCG
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DOI 10.1109/ACCESS.2021.3055497
Discipline Engineering
EISSN 2169-3536
Genre orig-research
Funding Science and Technology Department of Henan Province (grant 182102110160, funder ID 10.13039/501100011447)
National Natural Science Foundation of China (grant 32071680, funder ID 10.13039/501100001809)
ISSN 2169-3536
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccessLink https://doaj.org/article/1e8106baf5c946328a4fb1f0d3794bd4
Subject Terms Coders
Convolution
Datasets
Decoding
Depth prediction
encoder-decoder
Encoders-Decoders
Estimation
Feature extraction
feature pyramid
Object recognition
Optimization
Periodic structures
single image
Task analysis
Three-dimensional displays
URI https://ieeexplore.ieee.org/document/9340320
https://www.proquest.com/docview/2488745575
https://doaj.org/article/1e8106baf5c946328a4fb1f0d3794bd4