TEXT TO IMAGE GENERATION USING K-NEAREST-NEIGHBOR DIFFUSION
Main Authors | , , , , , , |
---|---|
Format | Patent |
Language | English, French, German |
Published | 18.09.2024 |
Summary: | A method and system for text-to-image generation using a KNN-diffusion model. The method includes receiving a text input and determining, in an embedding space (for example, a CLIP embedding space), the K image embeddings from a dataset that are nearest to the text embedding of the text input. The text embedding and the K nearest image embeddings are then concatenated, the concatenated embeddings are mapped into a feature space (for example, a joint multi-modal text-image space), and an image associated with the input text is generated based on that feature space. |
---|---|
Bibliography: | Application Number: EP20240155143 |
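
The retrieval and concatenation steps named in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the use of cosine similarity as the nearness measure, the representation of the concatenation as a sequence of K+1 vectors, and the two-dimensional toy vectors are all assumptions made for illustration. In practice the embeddings would come from an encoder such as CLIP, and the later steps (mapping into a feature space and generating the image) are not covered here.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (assumed nearness measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def k_nearest_image_embeddings(
    text_embedding: Vector, image_embeddings: Sequence[Vector], k: int
) -> List[List[float]]:
    """Return the k image embeddings most similar to the text embedding."""
    ranked = sorted(
        image_embeddings,
        key=lambda img: cosine_similarity(text_embedding, img),
        reverse=True,  # most similar first
    )
    return [list(v) for v in ranked[:k]]


def build_conditioning(
    text_embedding: Vector, image_embeddings: Sequence[Vector], k: int
) -> List[List[float]]:
    """Concatenate the text embedding with its k nearest image embeddings.

    Assumption: the concatenation is a sequence of k + 1 vectors, with the
    text embedding first.
    """
    neighbors = k_nearest_image_embeddings(text_embedding, image_embeddings, k)
    return [list(text_embedding)] + neighbors


# Toy data standing in for encoder outputs (hypothetical values).
text_embedding = [1.0, 0.0]
image_bank = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]

conditioning = build_conditioning(text_embedding, image_bank, k=2)
print(conditioning)
```

With K = 2, the sketch keeps the two image vectors that point most nearly in the same direction as the text vector and places them after the text embedding; a brute-force sort is used only for clarity, and a large dataset would normally call for an approximate nearest-neighbor index instead.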