TEXT TO IMAGE GENERATION USING K-NEAREST-NEIGHBOR DIFFUSION

Bibliographic Details
Main Authors: Polyak, Adam; Gafni, Oran; Singer, Uriel; Nachmani, Eliya; Taigman, Yaniv Nechemia; Sheynin, Shelly; Ashual, Oron
Format: Patent
Language: English, French, German
Published: 18.09.2024
Summary: A method and system for text-to-image generation using a KNN-diffusion model. The method includes receiving a text input and determining, in an embedding space, the K image embeddings from a dataset that are nearest to the text embedding of the text input. The embedding space is, for example, a CLIP embedding space. The method also includes concatenating the text embedding and the K nearest image embeddings, mapping the concatenated embeddings into a feature space, and generating an image associated with the input text based on the feature space. The feature space is, for example, a joint multi-modal text-image space.
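The retrieval and concatenation steps in the summary can be sketched as follows. This is a minimal illustration and not the patented implementation. It assumes cosine similarity as the distance measure (the summary says only "K nearest"), uses random vectors in place of real CLIP embeddings, and stacks the embeddings along a sequence axis as one possible reading of "concatenating". The name `knn_condition` is hypothetical. The mapping into the feature space and the diffusion-based image generation are not shown.

```python
import numpy as np

def knn_condition(text_emb, image_embs, k):
    """Stack a text embedding with its k nearest image embeddings.

    Hypothetical helper: cosine similarity is an assumed metric,
    not one stated in the source summary.
    """
    # Normalise so that a dot product equals cosine similarity.
    t = text_emb / np.linalg.norm(text_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = imgs @ t
    # Indices of the k most similar image embeddings, most similar first.
    nearest = np.argsort(-sims)[:k]
    # Row 0 is the text embedding, rows 1..k are the retrieved neighbours.
    cond = np.concatenate([t[None, :], imgs[nearest]], axis=0)
    return cond, nearest

rng = np.random.default_rng(0)
# Random stand-ins for dataset image embeddings (100 images, 8 dimensions).
image_embs = rng.normal(size=(100, 8))
# Stand-in text embedding placed close to image 7 in the shared space.
text_emb = image_embs[7] + 0.01 * rng.normal(size=8)

cond, nearest = knn_condition(text_emb, image_embs, k=3)
print(cond.shape, nearest)
```

In a full pipeline, `cond` would be the input that is mapped into the feature space that conditions image generation. For a large dataset, the exhaustive search shown here would typically be replaced by an approximate nearest-neighbor index.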
Bibliography: Application Number: EP20240155143