SCENE-BASED TEXT-TO-IMAGE GENERATION WITH HUMAN PRIORS

Bibliographic Details
Main Authors	GAFNI, Oran; POLYAK, Adam; TAIGMAN, Yaniv Nechemia
Format	Patent
Language	English, French
Published	11.07.2024
Subjects	CALCULATING, COMPUTING, COUNTING, PHYSICS
Summary:	In one embodiment, a method includes: accessing a text input and a scene input corresponding to the text input, wherein the scene input comprises semantic segmentations; generating, by machine-learning models, text tokens for the text input and scene tokens for the scene input; generating, by the machine-learning models, predicted image tokens based on the text tokens and the scene tokens; and generating, by the machine-learning models, an image corresponding to the text input and the scene input based on the predicted image tokens.
Bibliography:	Application Number: WO2023US84840
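
Illustration:	The four steps in the summary (access inputs, tokenize text and scene, predict image tokens, decode an image) can be sketched as a toy pipeline. This is only a minimal sketch and not the patented implementation: every function name, vocabulary size, and rule below is a hypothetical stand-in. In particular, the hashing tokenizer, the majority-label scene tokenizer, the checksum-based predictor, and the grayscale decoder merely take the place of the machine-learning models that the record mentions but does not specify.

```python
from typing import List, Sequence

# Hypothetical sizes, chosen only for illustration.
TEXT_VOCAB = 1000
SCENE_VOCAB = 256
IMAGE_VOCAB = 512
N_IMAGE_TOKENS = 16  # a 4x4 grid of image tokens


def tokenize_text(text: str) -> List[int]:
    """Stand-in for a learned text tokenizer: one token per word."""
    return [sum(word.encode("utf-8")) % TEXT_VOCAB for word in text.lower().split()]


def tokenize_scene(segmentation: Sequence[Sequence[int]], patch: int = 2) -> List[int]:
    """Stand-in for a learned scene tokenizer.

    Emits one token per patch of the semantic segmentation map: the most
    frequent class label in the patch (smallest label wins a tie).
    """
    height, width = len(segmentation), len(segmentation[0])
    tokens = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            labels = [
                segmentation[i][j]
                for i in range(r, min(r + patch, height))
                for j in range(c, min(c + patch, width))
            ]
            majority = max(sorted(set(labels)), key=labels.count)
            tokens.append(majority % SCENE_VOCAB)
    return tokens


def predict_image_tokens(
    text_tokens: Sequence[int],
    scene_tokens: Sequence[int],
    n_tokens: int = N_IMAGE_TOKENS,
) -> List[int]:
    """Stand-in for a token predictor, assumed here to be autoregressive.

    Each new image token depends on the full prefix: text tokens, scene
    tokens, and all image tokens produced so far. A position-weighted
    checksum replaces the real model.
    """
    sequence = list(text_tokens) + list(scene_tokens)
    image_tokens = []
    for _ in range(n_tokens):
        next_token = sum((i + 1) * t for i, t in enumerate(sequence)) % IMAGE_VOCAB
        sequence.append(next_token)
        image_tokens.append(next_token)
    return image_tokens


def decode_image(image_tokens: Sequence[int], width: int = 4) -> List[List[int]]:
    """Stand-in for a learned image decoder: token -> grayscale value in 0..255."""
    pixels = [t * 255 // (IMAGE_VOCAB - 1) for t in image_tokens]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def generate_image(text: str, segmentation: Sequence[Sequence[int]]) -> List[List[int]]:
    """Chain the four steps: tokenize text and scene, predict, decode."""
    text_tokens = tokenize_text(text)
    scene_tokens = tokenize_scene(segmentation)
    image_tokens = predict_image_tokens(text_tokens, scene_tokens)
    return decode_image(image_tokens)


# Example inputs: a caption and a 4x4 segmentation map with three class labels.
segmentation = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
]
image = generate_image("a dog on grass", segmentation)
```

The point of the sketch is the data flow: the scene tokens are concatenated with the text tokens, so the predicted image tokens are conditioned on both inputs, and changing either the caption or the segmentation map changes the prefix the predictor sees.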