GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation

Bibliographic Details
Main Authors: Gong, Jingzhi; Li, Sisi; d'Aloisio, Giordano; Ding, Zishuo; Ye, Yulong; Langdon, William B; Sarro, Federica
Format: Journal Article
Language: English
Published: 20.07.2024
Summary: Tuning the parameters and prompts for improving AI-based text-to-image generation has remained a substantial yet unaddressed challenge. Hence, we introduce GreenStableYolo, which improves the parameters and prompts for Stable Diffusion to both reduce GPU inference time and increase image generation quality using NSGA-II and Yolo. Our experiments show that despite a relatively slight trade-off (18%) in image quality compared to StableYolo (which only considers image quality), GreenStableYolo achieves a substantial reduction in inference time (266% less) and a 526% higher hypervolume, thereby advancing the state-of-the-art for text-to-image generation.
DOI: 10.48550/arxiv.2407.14982
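
The summary reports results in terms of hypervolume, the usual indicator for comparing Pareto fronts produced by multi-objective optimizers such as NSGA-II. As a minimal sketch of what that indicator measures, the Python below computes a two-objective hypervolume under some assumptions that are not taken from the paper: both objectives are minimized (inference time and a "quality loss" such as one minus a quality score), the reference point is chosen by hand, and all numeric values are made up for illustration. It is not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical representation: (inference_time, quality_loss), both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    unique = sorted(set(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    # Only points strictly better than the reference point contribute.
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_loss = ref[1]
    # Along the sorted front, time increases while loss strictly decreases,
    # so each point contributes one horizontal slab of the dominated region.
    for time, loss in front:
        area += (ref[0] - time) * (prev_loss - loss)
        prev_loss = loss
    return area


# Illustrative values only; (3.0, 0.5) is dominated by (2.0, 0.3).
candidates = [(1.0, 0.6), (2.0, 0.3), (4.0, 0.1), (3.0, 0.5)]
reference = (5.0, 1.0)  # hand-picked worst-case corner (assumption)

front = pareto_front(candidates)
hv = hypervolume_2d(candidates, reference)
print(front)
print(round(hv, 6))
```

A larger hypervolume means the front covers more of the objective space between itself and the reference point, so a configuration set that is much faster at a modest quality cost can score higher than one that optimizes quality alone.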