Text-conditioned image search based on transformation, aggregation, and composition of visio-linguistic features

Techniques are disclosed for text-conditioned image searching. A methodology implementing the techniques includes decomposing a source image into visual feature vectors associated with different levels of granularity. The method also includes decomposing a text query (defining a target image attribu...

Full description

Saved in:

Bibliographic Details
Main Authors	Chopra, Ayush, Jandial, Surgan, Chawla, Pranit, Badjatiya, Pinkesh, Sarkar, Mausoom
Format	Patent
Language	English
Published	08.08.2023
Subjects	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING PHYSICS
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Techniques are disclosed for text-conditioned image searching. A methodology implementing the techniques includes decomposing a source image into visual feature vectors associated with different levels of granularity. The method also includes decomposing a text query (defining a target image attribute) into feature vectors associated with different levels of granularity including a global text feature vector. The method further includes generating image-text embeddings based on the visual feature vectors and the text feature vectors to encode information from visual and textual features. The method further includes composing a visio-linguistic representation based on a hierarchical aggregation of the image-text embeddings to encode visual and textual information at multiple levels of granularity. The method further includes identifying a target image that includes the visio-linguistic representation and the global text feature vector, so that the target image relates to the target image attribute, and providing the target image as an image search result.
Bibliography:	Application Number: US202117160893