Video action content understanding method, device, equipment, and medium
Format | Patent |
---|---|
Language | Chinese, English |
Published | 03.05.2024 |
Summary: | The invention discloses a method, device, equipment, and medium for understanding video action content. The method comprises the following steps: acquiring preprocessed video data; acquiring question information related to the video action content; extracting image features and text features from the preprocessed video data and the question information using an ActionCLIP model, and computing the similarity between the image features and the text features according to a similarity matching strategy to obtain a maximum-probability text output result; extracting spatial and temporal features from the preprocessed video data using a SlowFast network, and classifying and recognizing the video action content based on those features to obtain a text output result; matching the maximum-probability text output result and the text output result against preset text features to obtain a text matching result; and obtaining a video action understanding result based on the text matching result. |
---|---|
Bibliography: | Application Number: CN202410169801 |
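
The similarity matching and preset matching steps described in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`cosine`, `softmax`, `best_text`, `match_with_presets`), the temperature value, and the rule for matching against presets are all assumptions made for illustration, and small hand-written vectors stand in for the features an ActionCLIP model or a SlowFast network would produce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_text(video_feature, text_features, labels, temperature=0.01):
    """Return the label whose text feature best matches the video feature.

    Assumption: similarities are scaled by a temperature and normalised
    with softmax, as is common for CLIP-style matching; the patent summary
    does not specify the exact strategy.
    """
    sims = [cosine(video_feature, t) for t in text_features]
    probs = softmax([s / temperature for s in sims])
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx], probs[idx]


def match_with_presets(candidates, preset_labels):
    """Hypothetical matching rule: keep each candidate label that appears
    in the preset vocabulary, dropping duplicates and preserving order."""
    matched = []
    for label in candidates:
        if label in preset_labels and label not in matched:
            matched.append(label)
    return matched


# Toy stand-ins for model outputs (not real ActionCLIP or SlowFast features).
labels = ["running", "jumping", "waving"]
text_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
video_feature = [0.1, 0.9, 0.2]

clip_label, clip_prob = best_text(video_feature, text_features, labels)
classifier_label = "jumping"  # placeholder for a SlowFast-based classifier output
result = match_with_presets([clip_label, classifier_label], set(labels))
```

In this sketch both branches agree, so `result` holds a single label. How the two outputs are reconciled when they disagree is not stated in the summary, so the sketch simply keeps every candidate found in the preset vocabulary.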