Inv-ReVersion: Enhanced Relation Inversion Based on Text-to-Image Diffusion Models


Bibliographic Details
Published in: Applied Sciences, Vol. 14, No. 8, p. 3338
Main Authors: Zhang, Guangzi; Qian, Yulin; Deng, Juntao; Cai, Xingquan
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.04.2024

More Information
Summary: Diffusion models are widely recognized in image generation for their ability to produce high-quality images from text prompts. As demand for customized models grows, various methods have emerged to capture appearance features. However, relations between entities, another crucial aspect of images, remain underexplored. This study focuses on enabling models to capture and generate high-level semantic images with specific relation concepts, which is a challenging task. To this end, the authors introduce the Inv-ReVersion framework, which uses inverse-relation text expansion to separate the feature fusion of multiple entities in images. Additionally, they employ a weighted contrastive loss that emphasizes part of speech, helping the model learn more abstract relation concepts. They also propose a high-frequency suppressor that reduces the time spent learning low-frequency details, enhancing the model's ability to generate image relations. Compared to existing baselines, the approach more accurately generates relation concepts between entities without additional computational cost, especially for abstract relation concepts.
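The record does not include the paper's equations, but the "weighted contrastive loss to emphasize part of speech" can be illustrated with a generic InfoNCE-style loss in which each positive pair carries a scalar weight, e.g. giving relation-bearing tokens (verbs, prepositions) more influence than appearance tokens. The function below is a minimal NumPy sketch under that assumption; the function name, weighting scheme, and temperature value are illustrative, not the paper's exact formulation.

```python
import numpy as np

def weighted_contrastive_loss(anchor, positives, negatives, weights, tau=0.1):
    """InfoNCE-style contrastive loss with per-positive weights.

    anchor:    (d,)   embedding of the learned relation token
    positives: (P, d) embeddings that should attract the anchor
    negatives: (N, d) embeddings that should repel the anchor
    weights:   (P,)   per-positive weights (hypothetically, higher for
                      relation-bearing parts of speech such as prepositions)
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = normalize(np.asarray(anchor, dtype=float))
    p = normalize(np.asarray(positives, dtype=float))
    n = normalize(np.asarray(negatives, dtype=float))

    pos_sim = p @ a / tau            # (P,) cosine similarities to positives
    neg_sim = n @ a / tau            # (N,) cosine similarities to negatives

    # Shared log-partition over all pairs (numerically stable log-sum-exp).
    all_sim = np.concatenate([pos_sim, neg_sim])
    m = all_sim.max()
    log_denom = m + np.log(np.exp(all_sim - m).sum())

    per_pos = -(pos_sim - log_denom)  # per-positive InfoNCE terms
    w = np.asarray(weights, dtype=float)
    return float((w * per_pos).sum() / w.sum())
```

Upweighting the positives that carry the relation concept makes their alignment dominate the loss, which matches the stated goal of steering the learned token toward abstract relational semantics rather than entity appearance.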
ISSN: 2076-3417
DOI: 10.3390/app14083338