Estimating Depth From Monocular Images as Classification Using Deep Fully Convolutional Residual Networks

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 28, No. 11, pp. 3174-3182
Main Authors: Cao, Yuanzhouhan; Wu, Zifeng; Shen, Chunhua
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2018

Summary: Depth estimation from single monocular images is a key component of scene understanding. Most existing algorithms formulate depth estimation as a regression problem because depth is continuous. However, depth values can hardly be regressed exactly to their ground-truth values. In this paper, we propose to formulate depth estimation as a pixelwise classification task. Specifically, we first discretize the continuous ground-truth depths into several bins and label the bins according to their depth ranges. We then solve the depth estimation problem as classification by training a fully convolutional deep residual network. Compared with estimating the exact depth of a single point, estimating its depth range is easier. More importantly, by performing depth classification instead of regression, we can easily obtain the confidence of each depth prediction in the form of a probability distribution. With this confidence, we can apply an information gain loss that exploits predictions close to the ground truth during training, as well as fully connected conditional random fields for post-processing, to further improve performance. We test the proposed method on both indoor and outdoor benchmark RGB-Depth datasets and achieve state-of-the-art performance.
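The summary describes two ingredients that can be illustrated concretely: discretizing continuous depths into labeled bins, and an information gain loss that credits predictions whose bins lie close to the ground-truth bin. The sketch below shows one plausible realization of both ideas in PyTorch. The bin count, depth range, log-spaced bin edges, and the Gaussian weighting parameter alpha are illustrative assumptions, not the paper's exact settings.

```python
import math
import torch
import torch.nn.functional as F

NUM_BINS = 50              # assumed number of discretization bins
D_MIN, D_MAX = 0.7, 10.0   # assumed valid depth range in metres

# Bin edges spaced uniformly in log-depth (an assumed, common choice
# for depth discretization; the paper may use a different scheme).
edges = torch.logspace(math.log10(D_MIN), math.log10(D_MAX), NUM_BINS + 1)

def depth_to_bins(depth: torch.Tensor) -> torch.Tensor:
    """Discretize continuous ground-truth depths into integer bin labels."""
    depth = depth.clamp(D_MIN, D_MAX)
    # Compare against interior edges only, so labels fall in [0, NUM_BINS - 1].
    return torch.bucketize(depth, edges[1:-1])

def information_gain_loss(logits: torch.Tensor,
                          target_bins: torch.Tensor,
                          alpha: float = 0.2) -> torch.Tensor:
    """Soft cross-entropy that also credits predictions whose bins are close
    to the ground-truth bin.
    logits: (B, NUM_BINS, H, W); target_bins: (B, H, W).
    alpha (assumed value) controls how fast credit decays with bin distance."""
    log_p = F.log_softmax(logits, dim=1)
    bins = torch.arange(NUM_BINS, device=logits.device).view(1, -1, 1, 1)
    dist2 = (bins - target_bins.unsqueeze(1)).float() ** 2
    weights = torch.exp(-alpha * dist2)        # Gaussian credit by bin distance
    weights = weights / weights.sum(dim=1, keepdim=True)
    return -(weights * log_p).sum(dim=1).mean()

# Example: a random "network output" and a random ground-truth depth map.
logits = torch.randn(2, NUM_BINS, 8, 8)
gt_depth = torch.rand(2, 8, 8) * (D_MAX - D_MIN) + D_MIN
loss = information_gain_loss(logits, depth_to_bins(gt_depth))
```

Normalizing the Gaussian weights turns them into a soft target distribution over bins, so the loss reduces to ordinary per-pixel cross-entropy as alpha grows large; smaller alpha spreads credit to neighboring depth ranges, which is the behavior the abstract attributes to the information gain loss. The CRF post-processing step is not covered by this sketch.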
ISSN: 1051-8215
EISSN: 1558-2205
DOI: 10.1109/TCSVT.2017.2740321