SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior
Format | Journal Article |
Language | English |
Published | 29.03.2024 |
Summary: | Novel View Synthesis (NVS) for street scenes plays a critical role in
autonomous driving simulation. The current mainstream techniques for achieving
it are neural rendering methods such as Neural Radiance Fields (NeRF) and 3D
Gaussian Splatting (3DGS). Although impressive progress has been made, current
methods struggle to maintain rendering quality for street scenes at viewpoints
that deviate significantly from the training viewpoints. This issue stems from
the sparse training views captured by a fixed camera on a moving vehicle. To
tackle this problem, we propose a novel approach that enhances the capacity of
3DGS by leveraging a prior from a Diffusion Model along with complementary
multi-modal data. Specifically, we first fine-tune a Diffusion Model by
conditioning it on images from adjacent frames, while exploiting depth data
from LiDAR point clouds to supply additional spatial information. We then apply
the Diffusion Model to regularize the 3DGS at unseen views during training.
Experimental results validate the effectiveness of our method compared with
current state-of-the-art models, and demonstrate its advantage in rendering
images from broader viewpoints. |
DOI: | 10.48550/arxiv.2403.20079 |
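The abstract describes a two-stage pipeline: fine-tune a diffusion model conditioned on adjacent-frame images and projected LiDAR depth, then use it to regularize 3DGS renders at unseen viewpoints during training. Below is a minimal, hypothetical sketch of what such a regularized training step could look like. All interfaces here (`scene.render`, `diffusion.refine`, `perturb_pose`, `project_lidar`) are illustrative assumptions, not the authors' actual API.

```python
import torch
import torch.nn.functional as F

def training_step(scene, diffusion, train_views, lidar_points, optimizer,
                  lambda_reg=0.1):
    """One optimization step combining standard 3DGS photometric supervision
    with a diffusion-prior regularizer at an unseen viewpoint (a sketch of
    the scheme described in the abstract, not the paper's implementation)."""
    # --- Standard 3DGS photometric loss on a captured training view ---
    view = train_views.sample()           # camera pose + ground-truth image
    rendered = scene.render(view.camera)  # differentiable Gaussian splatting
    loss = F.l1_loss(rendered, view.image)

    # --- Diffusion-prior regularization at an unseen viewpoint ---
    # Perturb the capture pose to simulate a view off the vehicle trajectory,
    # where sparse training coverage would otherwise degrade quality.
    unseen_cam = perturb_pose(view.camera)          # hypothetical helper
    pseudo = scene.render(unseen_cam)               # current (noisy) render

    # The fine-tuned diffusion model is conditioned on an adjacent-frame
    # image and on LiDAR depth projected into the unseen view, per the
    # abstract. The refined output serves as a fixed pseudo ground truth.
    with torch.no_grad():
        refined = diffusion.refine(
            pseudo,
            cond_image=view.adjacent_image,
            cond_depth=project_lidar(lidar_points[view.index], unseen_cam),
        )
    loss = loss + lambda_reg * F.l1_loss(pseudo, refined)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Detaching the diffusion output (via `torch.no_grad()`) treats the prior as a supervision target rather than a jointly optimized module, which is the usual pattern when a pretrained generative model regularizes a scene representation; the weight `lambda_reg` is an assumed hyperparameter.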