Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation

Bibliographic Details
Main Authors	Liu, Fangfu; Wang, Hanyang; Chen, Weiliang; Sun, Haowen; Duan, Yueqi
Format	Journal Article
Language	English
Published	14.03.2024
Subjects	Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning
DOI	10.48550/arxiv.2403.09625

Summary:	Recent years have witnessed the strong power of 3D generation models, which offer a new level of creative flexibility by allowing users to guide the 3D content generation process through a single image or natural language. However, it remains challenging for existing 3D generation methods to create subject-driven 3D content across diverse prompts. In this paper, we introduce a novel 3D customization method, dubbed Make-Your-3D, that can personalize high-fidelity and consistent 3D content from only a single image of a subject with a text description within 5 minutes. Our key insight is to harmonize the distributions of a multi-view diffusion model and an identity-specific 2D generative model, aligning them with the distribution of the desired 3D subject. Specifically, we design a co-evolution framework to reduce the variance of distributions, where each model undergoes a process of learning from the other through identity-aware optimization and subject-prior optimization, respectively. Extensive experiments demonstrate that our method can produce high-quality, consistent, and subject-specific 3D content with text-driven modifications that are unseen in the subject image.