Automode: The Automatic LoRA Based on Stable Diffusion
Published in | 2024 IEEE 8th International Conference on Vision, Image and Signal Processing (ICVISP) pp. 1 - 5 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 27.12.2024 |
Subjects | |
Summary: | Stable Diffusion is a powerful tool for image generation but encounters difficulties in generating images of untrained subjects, such as newly introduced customized content. Traditional fine-tuning methods often require the manual creation of datasets, followed by computationally expensive full-parameter training or low-rank adaptation, as seen in LoRA. While LoRA mitigates the cost associated with full-parameter training, traditional methods still face challenges regarding scalability and efficiency. To address these limitations, we introduce a novel automated annotation pipeline that integrates an advanced attention-based image captioning system with LoRA-based fine-tuning. Our approach not only automates the annotation process but also optimizes training by reducing the parameter space, leading to more efficient and scalable fine-tuning. We validated our approach using a custom dataset of newly introduced customized content, demonstrating significant improvements in both annotation efficiency and image generation quality. This work extends the capabilities of automated fine-tuning in generative models, providing a reliable solution for artificial content creation. |
---|---|
DOI: | 10.1109/ICVISP64524.2024.10959379 |
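
The summary says that LoRA reduces the parameter space relative to full-parameter training. The sketch below illustrates that general idea only; it is not the authors' implementation, and the record gives no layer sizes or ranks. The function names, the 768-wide layer, and the rank of 4 are all illustrative assumptions. It uses the standard LoRA formulation, in which a frozen weight `W` is augmented with a trainable low-rank product `B @ A` scaled by `alpha / rank`.

```python
# Minimal LoRA sketch in plain Python (no external libraries).
# All names and sizes here are hypothetical, chosen for illustration only.

def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, alpha: float, rank: int):
    """Compute y = W x + (alpha / rank) * B (A x); W stays frozen."""
    base = matvec(W, x)           # frozen pretrained path
    down = matvec(A, x)           # project input down to `rank` dimensions
    up = matvec(B, down)          # project back up to d_out dimensions
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, up)]


if __name__ == "__main__":
    # Assumed example: a 768 x 768 projection adapted with rank 4.
    full = full_param_count(768, 768)
    lora = lora_param_count(768, 768, 4)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
```

With `B` initialized to zeros, as is conventional for LoRA, the adapted layer initially reproduces the frozen layer's output, so fine-tuning starts from the pretrained behavior.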