Orthogonal Adaptation for Modular Customization of Diffusion Models
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 04.12.2023 |
Summary: Customization techniques for text-to-image models have paved the way for a wide range of previously unattainable applications, enabling the generation of specific concepts across diverse contexts and styles. While existing methods facilitate high-fidelity customization for individual concepts or a limited, pre-defined set of them, they fall short of achieving scalability, where a single model can seamlessly render countless concepts. In this paper, we address a new problem called Modular Customization, with the goal of efficiently merging customized models that were fine-tuned independently for individual concepts. This allows the merged model to jointly synthesize concepts in one image without compromising fidelity or incurring any additional computational costs. To address this problem, we introduce Orthogonal Adaptation, a method designed to encourage the customized models, which do not have access to each other during fine-tuning, to have orthogonal residual weights. This ensures that during inference time, the customized models can be summed with minimal interference. Our proposed method is both simple and versatile, applicable to nearly all optimizable weights in the model architecture. Through an extensive set of quantitative and qualitative evaluations, our method consistently outperforms relevant baselines in terms of efficiency and identity preservation, demonstrating a significant leap toward scalable customization of diffusion models.
DOI: 10.48550/arxiv.2312.02432
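
The record carries only the abstract, no code, but the mechanism it describes lends itself to a short illustration. Below is a minimal, hypothetical NumPy sketch of the stated idea: each concept's low-rank (LoRA-style) residual is built on a fixed down-projection taken from a disjoint block of a shared orthonormal basis, so residuals fine-tuned independently are mutually orthogonal and can be merged by plain summation with minimal interference. All dimensions, variable names, and the low-rank parameterization here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of orthogonal low-rank residuals (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)
d, r, n_concepts = 64, 4, 3  # layer width, residual rank, number of concepts

# Shared orthonormal basis; each concept is assigned a disjoint block of columns,
# giving fixed down-projections with A_i @ A_j.T == 0 for i != j.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
A = [Q[:, i * r:(i + 1) * r].T for i in range(n_concepts)]

# Stand-ins for the per-concept trained up-projections (tuned independently).
B = [rng.standard_normal((d, r)) for _ in range(n_concepts)]

# Residual weights and the merged update: a plain sum, no joint optimization.
deltas = [B[i] @ A[i] for i in range(n_concepts)]
merged = sum(deltas)

# Residuals are orthogonal under the Frobenius inner product:
# <dW_i, dW_j> = tr(B_j.T @ B_i @ A_i @ A_j.T) = 0 for i != j.
print(np.sum(deltas[0] * deltas[1]))  # ~0.0 up to floating point

# An input lying in concept 0's subspace is untouched by the other residuals,
# so the merged weights reproduce concept 0's customization exactly.
x = A[0].T @ rng.standard_normal(r)
print(np.allclose(merged @ x, deltas[0] @ x))  # True
```

Under these assumptions the cross-terms vanish exactly; the abstract's hedge of "minimal interference" covers the realistic setting where inputs are not perfectly confined to a single concept's subspace.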