A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images

Bibliographic Details
Published in	2017 IEEE International Conference on Computer Vision (ICCV), pp. 3392-3400
Main Authors	Li, Jun; Klein, Reinhard; Yao, Angela
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2017
Subjects	Convolution; Estimation; Image resolution; Optimization; Three-dimensional displays; Training
Summary:	Estimating depth from a single RGB image is an ill-posed and inherently ambiguous problem. State-of-the-art deep learning methods can now estimate accurate 2D depth maps, but when the maps are projected into 3D, they lack local detail and are often highly distorted. We propose a fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused into an accurate and detailed depth map. We also define a novel set loss over multiple images; by regularizing the estimation between a common set of images, the network is less prone to overfitting and achieves better accuracy than competing methods. Experiments on the NYU Depth v2 dataset show that our depth predictions are competitive with the state of the art and lead to faithful 3D projections.
ISSN:	2380-7504
DOI:	10.1109/ICCV.2017.365
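
Illustrative sketch: the summary describes fusing a predicted depth map with predicted depth gradients. The record does not give the details of the paper's fusion step, so the code below is not the authors' method. It is a minimal sketch that assumes a simple least-squares fusion, minimising ||D - depth_pred||^2 + lam * ||grad(D) - grad_pred||^2 by gradient descent. All names (forward_diff, fuse_depth_and_gradients, lam, steps, lr) and the toy data are hypothetical.

```python
import numpy as np


def forward_diff(d):
    """Forward differences along x and y; the last column/row is left at zero."""
    gx = np.zeros_like(d)
    gy = np.zeros_like(d)
    gx[:, :-1] = d[:, 1:] - d[:, :-1]
    gy[:-1, :] = d[1:, :] - d[:-1, :]
    return gx, gy


def fuse_depth_and_gradients(depth_pred, gx_pred, gy_pred,
                             lam=1.0, steps=500, lr=0.1):
    """Assumed least-squares fusion (not the paper's network).

    Minimises 0.5*||D - depth_pred||^2 + 0.5*lam*||grad(D) - grad_pred||^2
    with plain gradient descent, starting from the coarse depth.
    """
    d = depth_pred.astype(float).copy()
    for _ in range(steps):
        gx, gy = forward_diff(d)
        rx, ry = gx - gx_pred, gy - gy_pred  # gradient residuals
        # Apply the adjoint of the forward-difference operator to the residuals.
        adj = np.zeros_like(d)
        adj[:, :-1] -= rx[:, :-1]
        adj[:, 1:] += rx[:, :-1]
        adj[:-1, :] -= ry[:-1, :]
        adj[1:, :] += ry[:-1, :]
        d -= lr * ((d - depth_pred) + lam * adj)
    return d


# Toy data: a sharp depth step, a blurred "coarse" estimate of it,
# and gradient estimates that still carry the sharp edge.
true_depth = np.zeros((16, 16))
true_depth[:, 8:] = 1.0
coarse = (np.roll(true_depth, 1, axis=1) + true_depth
          + np.roll(true_depth, -1, axis=1)) / 3.0
gx_pred, gy_pred = forward_diff(true_depth)

fused = fuse_depth_and_gradients(coarse, gx_pred, gy_pred)
err_coarse = np.mean((coarse - true_depth) ** 2)
err_fused = np.mean((fused - true_depth) ** 2)
print(err_fused < err_coarse)
```

In this toy setting the gradient term pulls the blurred estimate back toward the sharp edge, which loosely mirrors the summary's point that gradient predictions add local detail. The step size lr is kept small here because the quadratic's curvature grows with lam.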