Canvas: Compositional Generation for Art Painting with Seamless Subject-Driven Infusion

Bibliographic Details
Published in	IEEE Transactions on Circuits and Systems for Video Technology, p. 1
Main Authors	Wang, Yunnan; Li, Ziqiang; Zhang, Wenyao; Lv, Lexiang; Zhang, Zequn; Shen, Xiaoyu; Jin, Xin; Zeng, Wenjun
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Art image generation; Circuits and systems; Diffusion models; Electrical impedance tomography; Image synthesis; latent diffusion model; Noise reduction; subject-driven image composition; Sun; Text to image; Training

Summary:	While diffusion-based art image synthesis has achieved great success in terms of quality, it still falls short in integrating artist-specified subjects with artistic style. In this paper, we propose Canvas, a framework that leverages the capabilities of text-guided latent diffusion models (LDMs) for flexible art image composition driven by diverse customized subject concepts. Specifically, we start by collecting art images manually drawn by proficient artists and annotating the corresponding subject concepts, forming the CreaCulture dataset. Based on this dataset, we build Canvas with two generation stages. First, a Stable Diffusion-based stylistic LDM is fine-tuned on the original CreaCulture dataset to generate an art-style background with annotated subject concepts. To alleviate the limited scope of tagged subject concepts, we propose a nature-to-art (N2A) transition that expands CreaCulture using the natural/art concepts from the pre-trained/stylistic LDM, facilitating the fine-tuning of the tailor-made concept-derived LDM. Additionally, Subject-Infused Attention (SIA) is integrated into the concept-derived LDM, which seamlessly composites the user-specified natural foreground with the pre-generated art background image in a training-free manner. Extensive experiments demonstrate that Canvas outperforms state-of-the-art alternatives in the art image synthesis setting. The code and dataset are available at https://github.com/wangyunnan/Canvas.
ISSN:	1051-8215, 1558-2205
DOI:	10.1109/TCSVT.2025.3587757