Text pedestrian re-identification method based on cross-modal semantic alignment

Bibliographic Details
Main Authors	ZHAO GUOZHI, GAN WENJUN, WU YONG, HE QIANG, LIU JIAWEI
Format	Patent
Language	Chinese, English
Published	24.05.2024
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
Summary:	The invention discloses a text-based pedestrian re-identification method based on cross-modal semantic alignment. The method comprises the following steps: (1) inputting a text sentence and a pedestrian image into a text encoder and a visual encoder, respectively, to obtain text features and image features; (2) clustering the text features and the image features with a token clustering learning module to obtain clustered text and visual part-prototype features; (3) aligning the clustered text and visual part-prototype features by introducing an optimal transport algorithm, and computing a loss function on the aligned prototype features to update the text encoder and the visual encoder; and (4) computing a similarity matrix between the text prototype features and the visual prototype features to obtain the retrieval result. The method can effectively extract discriminative key information from the pedestrian image and the text description, and the text and the image modali
Bibliography:	Application Number: CN202410072896
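
Illustration:	Steps 3 and 4 of the summary (optimal-transport alignment of part prototypes, then similarity-based retrieval) can be sketched as follows. This is a minimal sketch and not the patented implementation: the record does not say which optimal transport solver, cost function, or scoring rule is used, so the Sinkhorn iteration, the cosine-distance cost, the uniform prototype weights, and the ranking by transport cost are all assumptions. The function names (`cosine`, `sinkhorn`, `ot_alignment_cost`, `rank_gallery`) are hypothetical, and the encoders and the token clustering module are omitted; prototypes are assumed to be given as plain lists of floats.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularised transport plan for a cost matrix.

    Assumption: every prototype carries equal mass (uniform marginals).
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass on each text prototype
    b = [1.0 / m] * m  # mass on each visual prototype
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def ot_alignment_cost(text_protos, image_protos, eps=0.1):
    """Return (transport cost, plan) between two sets of prototypes.

    Assumption: the cost of matching two prototypes is 1 - cosine similarity.
    """
    cost = [[1.0 - cosine(t, p) for p in image_protos] for t in text_protos]
    plan = sinkhorn(cost, eps)
    total = sum(
        plan[i][j] * cost[i][j]
        for i in range(len(cost))
        for j in range(len(cost[0]))
    )
    return total, plan


def rank_gallery(text_protos, gallery, eps=0.1):
    """Rank gallery images (each a list of prototypes) from best to worst match."""
    costs = [ot_alignment_cost(text_protos, g, eps)[0] for g in gallery]
    return sorted(range(len(gallery)), key=lambda k: costs[k])


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]            # toy text prototypes
    matching = [[0.9, 0.1], [0.1, 0.9]]          # toy visual prototypes
    unrelated = [[-1.0, 0.0], [0.0, -1.0]]
    print(rank_gallery(query, [unrelated, matching]))
```

In a training setting, a quantity such as the transport cost could serve as the alignment loss that updates both encoders; that usage is likewise an assumption rather than something the record states.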