Automode: The Automatic LoRA Based on Stable Diffusion

Bibliographic Details
Published in: 2024 IEEE 8th International Conference on Vision, Image and Signal Processing (ICVISP), pp. 1-5
Main Authors: Wu, Yao; Lu, Yifan
Format: Conference Proceeding
Language: English
Published: IEEE, 27.12.2024
Summary: Stable Diffusion is a powerful tool for image generation but encounters difficulties in generating images of untrained subjects, such as newly introduced customized content. Traditional fine-tuning methods often require the manual creation of datasets, followed by computationally expensive full-parameter training or low-rank adaptation, as seen in LoRA. While LoRA mitigates the cost associated with full-parameter training, traditional methods still face challenges regarding scalability and efficiency. To address these limitations, we introduce a novel automated annotation pipeline that integrates an advanced attention-based image captioning system with LoRA-based fine-tuning. Our approach not only automates the annotation process but also optimizes training by reducing the parameter space, leading to more efficient and scalable fine-tuning. We validated our approach using a custom dataset of newly introduced customized content, demonstrating significant improvements in both annotation efficiency and image generation quality. This work extends the capabilities of automated fine-tuning in generative models, providing a reliable solution for artificial content creation.
DOI: 10.1109/ICVISP64524.2024.10959379
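
The summary's point that LoRA reduces the parameter space compared with full-parameter training can be illustrated with a minimal sketch. This is not the authors' implementation, and the record does not describe their captioning system or training settings. The sketch only shows the general low-rank adaptation idea: a frozen weight matrix W is left untouched and a trainable update B A of rank r is added to it. The layer size (768 x 768), the rank (8), and the helper names `lora_param_counts`, `matmul`, and `lora_forward` are assumptions chosen for illustration.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in               # full fine-tuning trains every entry of W
    lora = rank * (d_in + d_out)      # LoRA trains only A (rank x d_in) and B (d_out x rank)
    return full, lora


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / rank) * B A x for a column vector x."""
    rank = len(A)                     # A has one row per rank dimension
    base = matmul(W, x)               # output of the frozen layer
    update = matmul(B, matmul(A, x))  # low-rank correction
    scale = alpha / rank
    return [[b[0] + scale * u[0]] for b, u in zip(base, update)]


# Hypothetical sizes: a 768 x 768 projection adapted at rank 8.
full, lora = lora_param_counts(768, 768, 8)
print(full, lora)

# Tiny 2 x 2 example with rank 1. B starts at zero, so the adapted
# layer initially reproduces the frozen layer exactly.
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[0.5, -0.5]]
B = [[0.0], [0.0]]
x = [[1.0], [1.0]]
print(lora_forward(W, A, B, x, alpha=1.0))
```

Under these assumed sizes the low-rank update trains 12,288 values instead of 589,824 for that one matrix, which is the kind of reduction the summary refers to. The actual savings in the paper depend on which layers are adapted and at what rank, and the record does not state either.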