Bridging Implicit and Explicit Geometric Transformation for Single-Image View Synthesis

Bibliographic Details
Published in: arXiv.org
Main Authors: Park, Byeongjun; Go, Hyojun; Kim, Changick
Format: Paper, Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 15.03.2024

More Information
Summary: Creating novel views from a single image has made tremendous strides with advanced autoregressive models, since unseen regions must be inferred from the visible scene content. Although recent methods generate high-quality novel views, synthesizing with only one explicit or implicit 3D geometry involves a trade-off between two objectives, which we call the "seesaw" problem: 1) preserving reprojected contents and 2) completing realistic out-of-view regions. Moreover, autoregressive models incur considerable computational cost. In this paper, we propose a single-image view synthesis framework that mitigates the seesaw problem while using an efficient non-autoregressive model. Motivated by the observation that explicit methods preserve reprojected pixels well and implicit methods complete out-of-view regions realistically, we introduce a loss function that makes the two renderers complement each other: explicit features improve the reprojected area of the implicit features, and implicit features improve the out-of-view area of the explicit features. With the proposed architecture and loss function, we alleviate the seesaw problem, outperforming autoregressive state-of-the-art methods and generating images ≈100 times faster. We validate the efficiency and effectiveness of our method with experiments on the RealEstate10K and ACID datasets.
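
The summary describes a loss that lets each renderer supervise the other in the region where it is stronger. The sketch below is a rough illustration of that idea only, not the paper's actual formulation: the function name, the binary visibility mask, the L1 distance, and the stop-gradient choice are all assumptions introduced here for clarity.

```python
import torch
import torch.nn.functional as F

def complementary_loss(f_explicit: torch.Tensor,
                       f_implicit: torch.Tensor,
                       visible_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical cross-supervision between two renderers.

    f_explicit, f_implicit: (B, C, H, W) feature maps from the explicit
        and implicit renderers, respectively.
    visible_mask: (B, 1, H, W) binary mask; 1 where source pixels
        reproject into the target view, 0 in out-of-view regions.
    """
    out_mask = 1.0 - visible_mask

    # Reprojected regions: explicit features preserve source content well,
    # so pull the implicit features toward a stop-gradient copy of them.
    loss_visible = F.l1_loss(f_implicit * visible_mask,
                             (f_explicit * visible_mask).detach())

    # Out-of-view regions: implicit features complete unseen content more
    # realistically, so pull the explicit features toward them instead.
    loss_outview = F.l1_loss(f_explicit * out_mask,
                             (f_implicit * out_mask).detach())

    return loss_visible + loss_outview
```

In this sketch, `detach()` stops gradients through the supervising branch, so each renderer only receives a learning signal in the region where the other is assumed to be more reliable.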
ISSN: 2331-8422
DOI: 10.48550/arxiv.2209.07105