Text pedestrian re-identification method based on cross-modal semantic alignment

Bibliographic Details
Main Authors ZHAO GUOZHI, GAN WENJUN, WU YONG, HE QIANG, LIU JIAWEI
Format Patent
Language Chinese, English
Published 24.05.2024
Summary: The invention discloses a text pedestrian re-identification method based on cross-modal semantic alignment. The method comprises the following steps: 1, inputting a text description and a pedestrian image into a text encoder and a visual encoder, respectively, to obtain text features and image features; 2, clustering the text features and the image features with a token clustering learning module to obtain clustered text part prototype features and visual part prototype features; 3, aligning the clustered text and visual part prototype features by introducing an optimal transport algorithm, and then calculating a loss function on the aligned prototype features to update the text encoder and the visual encoder; and 4, calculating a similarity matrix between the text prototype features and the visual prototype features to obtain the retrieval result. The method can effectively extract discriminative key information from the pedestrian image and the text description, and the text and the image modali...
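
Steps 3 and 4 of the summary can be illustrated with a minimal sketch. It is not the patented implementation: the record does not say which optimal transport solver, cost function, or marginals are used, so the sketch assumes entropy-regularised Sinkhorn iterations, a cost of one minus cosine similarity, and uniform marginals. The part prototypes are taken as given (the encoders, the token clustering module, and the training loss are omitted), and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Approximate transport plan for a cost matrix (assumed solver).

    Uses entropy regularisation with strength eps and uniform marginals
    over the rows (text prototypes) and columns (visual prototypes).
    """
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    row_mass, col_mass = 1.0 / n, 1.0 / m
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_mass / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def pair_score(text_protos, image_protos, eps=0.1):
    """Similarity of one text and one image after prototype alignment."""
    sim = [[cosine(t, p) for p in image_protos] for t in text_protos]
    cost = [[1.0 - s for s in row] for row in sim]
    plan = sinkhorn_plan(cost, eps)
    # Weight each prototype-pair similarity by its transported mass.
    return sum(plan[i][j] * sim[i][j]
               for i in range(len(sim)) for j in range(len(sim[0])))


def rank_images(text_protos, gallery):
    """Return gallery indices sorted from best to worst match."""
    scores = [pair_score(text_protos, image) for image in gallery]
    return sorted(range(len(gallery)), key=lambda k: scores[k], reverse=True)


if __name__ == "__main__":
    # Toy data: two part prototypes per sample, three dimensions each.
    query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    gallery = [
        [[0.0, 0.0, 1.0], [0.0, 0.1, 1.0]],  # unrelated image
        [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],  # closely matching image
    ]
    print(rank_images(query, gallery))
```

In this sketch the transport plan acts as a soft matching between text parts and visual parts, so the pair score rewards images whose parts can each be matched to some described part. In a trained system a loss computed on the aligned prototypes would update both encoders, as step 3 describes.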
Bibliography: Application Number: CN202410072896