Text pedestrian re-identification method based on cross-modal semantic alignment
Format | Patent |
---|---|
Language | Chinese, English |
Published | 24.05.2024 |
Summary: | The invention discloses a text-based pedestrian re-identification method based on cross-modal semantic alignment. The method comprises the following steps: (1) a text sentence and a pedestrian image are input into a text encoder and a visual encoder, respectively, to obtain text features and image features; (2) a token clustering learning module clusters the text features and the image features to obtain clustered textual and visual part prototype features; (3) the clustered textual and visual part prototype features are aligned using an optimal transport algorithm, and a loss function computed on the aligned prototype features is used to update the text encoder and the visual encoder; and (4) a similarity matrix between the text prototype features and the visual prototype features is computed to obtain the retrieval result. The method can effectively extract discriminative key information from the pedestrian image and the text description, and the text and the image modali |
---|---|
Bibliography: | Application Number: CN202410072896 |
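
The four steps in the summary can be illustrated with a rough NumPy sketch. This is not the patented implementation: the record does not give the encoder architectures, the clustering module, the optimal transport solver, or the loss. The sketch assumes token features are already available as arrays, swaps in a softmax soft-assignment for the token clustering module, uses entropy-regularised Sinkhorn iterations for the alignment step, and pools prototypes by averaging before ranking. All function names and parameters (`cluster_tokens`, `sinkhorn_plan`, `ot_alignment_loss`, `rank_gallery`, `tau`, `reg`) are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def cluster_tokens(tokens, centers, tau=0.1):
    """Soft-assign tokens (n, d) to centers (k, d); return k unit-norm prototypes.

    Assumption: a softmax over cosine similarities stands in for the
    token clustering learning module, whose details are not in the record.
    """
    logits = l2_normalize(tokens) @ l2_normalize(centers).T / tau   # (n, k)
    logits = logits - logits.max(axis=1, keepdims=True)             # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)          # softmax over centers
    # Each prototype is the assignment-weighted mean of the tokens.
    protos = weights.T @ tokens / (weights.sum(axis=0)[:, None] + 1e-12)
    return l2_normalize(protos)


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropy-regularised transport plan with uniform marginals (Sinkhorn)."""
    k_t, k_v = cost.shape
    a = np.full(k_t, 1.0 / k_t)          # mass on each text prototype
    b = np.full(k_v, 1.0 / k_v)          # mass on each visual prototype
    kernel = np.exp(-cost / reg)
    u = np.ones(k_t)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)           # match column marginals
        u = a / (kernel @ v)             # match row marginals
    return u[:, None] * kernel * v[None, :]


def ot_alignment_loss(text_protos, image_protos):
    """Transport cost between prototype sets, using cosine distance as the cost."""
    cost = np.clip(1.0 - text_protos @ image_protos.T, 0.0, None)
    plan = sinkhorn_plan(cost)
    return float((plan * cost).sum()), plan


def rank_gallery(text_protos, image_protos):
    """Rank gallery images for each text query.

    text_protos: (Q, k, d), image_protos: (G, k, d).
    Assumption: prototypes are mean-pooled into one vector per sample.
    Returns the (Q, G) cosine similarity matrix and, per query, gallery
    indices sorted from most to least similar.
    """
    t = l2_normalize(text_protos.mean(axis=1))
    g = l2_normalize(image_protos.mean(axis=1))
    sim = t @ g.T
    return sim, np.argsort(-sim, axis=1)
```

In a training loop the value returned by `ot_alignment_loss` would be the quantity back-propagated into both encoders, which requires an autodiff framework rather than plain NumPy. At inference time only `rank_gallery` is needed.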