Automode: The Automatic LoRA Based on Stable Diffusion

Bibliographic Details
Published in	2024 IEEE 8th International Conference on Vision, Image and Signal Processing (ICVISP), pp. 1-5
Main Authors	Wu, Yao; Lu, Yifan
Format	Conference Proceeding
Language	English
Published	IEEE 27.12.2024
Subjects	Annotations; auto-LoRA; automatic annotation; Costs; customized content creation; Faces; fine-tuning; image captioning; Image synthesis; Manuals; Pipelines; Reliability; Scalability; Signal processing; Stable diffusion; Training
Summary:	Stable Diffusion is a powerful tool for image generation but encounters difficulties in generating images of untrained subjects, such as newly introduced customized content. Traditional fine-tuning methods often require the manual creation of datasets, followed by computationally expensive full-parameter training or low-rank adaptation, as seen in LoRA. While LoRA mitigates the cost associated with full-parameter training, traditional methods still face challenges regarding scalability and efficiency. To address these limitations, we introduce a novel automated annotation pipeline that integrates an advanced attention-based image captioning system with LoRA-based fine-tuning. Our approach not only automates the annotation process but also optimizes training by reducing the parameter space, leading to more efficient and scalable fine-tuning. We validated our approach using a custom dataset of newly introduced customized content, demonstrating significant improvements in both annotation efficiency and image generation quality. This work extends the capabilities of automated fine-tuning in generative models, providing a reliable solution for artificial content creation.
DOI:	10.1109/ICVISP64524.2024.10959379