Video action content understanding method and device, equipment and medium

Bibliographic Details
Main Authors: LU XINKAI, GU JIAXIN, SHEN XIONG
Format: Patent
Language: Chinese, English
Published: 03.05.2024
Summary: The invention discloses a method, apparatus, device, and medium for understanding video action content. The method comprises: acquiring preprocessed video data; acquiring question information related to the video action content; extracting image features from the preprocessed video data and text features from the question information using an ActionCLIP model, and performing similarity matching between the image features and the text features according to a similarity matching strategy to obtain a maximum-probability text output result; extracting spatial and temporal features of the preprocessed video data using a SlowFast network, and classifying and recognizing the video action content from those features to obtain a text output result; matching the maximum-probability text output result and the text output result against preset text features to obtain a text matching result; and obtaining a video action understanding result based on the text matching result.
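The similarity matching and text matching steps in the summary can be illustrated with a minimal sketch. It assumes the image and text features are already available as plain vectors, uses cosine similarity followed by a softmax as the similarity matching strategy, and uses a simple membership check as the matching rule against preset labels. The record does not specify any of these details, so the function names, the temperature value, and the matching rule are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_probability_text(video_feature, text_features, labels, temperature=0.01):
    """Return the label whose text feature best matches the video feature.

    Assumption: cosine similarity scaled by a temperature, then softmax.
    The temperature value is illustrative, not taken from the source.
    """
    scores = [cosine(video_feature, t) / temperature for t in text_features]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


def match_with_presets(clip_label, slowfast_label, preset_labels):
    """Hypothetical matching rule (not from the source): keep each branch's
    label only if it is a preset label, in order, without duplicates."""
    matched = []
    for label in (clip_label, slowfast_label):
        if label in preset_labels and label not in matched:
            matched.append(label)
    return matched


# Toy features standing in for the outputs of the two feature extractors.
labels = ["running", "jumping", "waving"]
text_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clip_label, clip_prob = max_probability_text([0.1, 0.9, 0.2], text_features, labels)
result = match_with_presets(clip_label, "jumping", set(labels))
```

When both branches agree, the matched list holds a single label. When they disagree, both candidates are kept, and a real system would need a further rule to choose between them.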
Bibliography: Application Number: CN202410169801