Question image triple structured guided visual question answering method and device

Bibliographic Details
Main Authors HAN ZEFANG, XIE XUEMEI, FANG MIAN, LIU YONG, LI JINHANG
Format Patent
Language Chinese, English
Published 07.07.2023
Summary: The invention relates to a question-image triple-structure-guided visual question answering method and device. The method comprises the following steps: acquiring a target image and a target question about the target image; extracting a question global feature, a plurality of question attribute triple features and a plurality of question relationship triple features of the target question by using a first target model; extracting a plurality of image attribute triple features of the target image by using a second target model; determining, by using a target attention model, a first relevancy between each image attribute triple feature and each question attribute triple feature, the plurality of first relevancies forming an attribute attention weight matrix; and concatenating the target image attribute triple feature, the target image relationship triple feature and the question global feature, and inputting the concatenated features into a target answer classifier to obtain answer information of the target
Bibliography: Application Number: CN202310261086
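
The attention and concatenation steps named in the summary can be illustrated with a minimal sketch. The summary does not say how the first relevancy is computed or how the target image attribute triple feature is chosen, so everything below is an assumption made for illustration: relevancy is taken to be a scaled dot product, the target feature is taken to be a softmax-weighted sum of the image attribute triple features, and all function names and example vectors are hypothetical. The answer classifier itself is omitted.

```python
import math


def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relevancy_matrix(image_feats, question_feats):
    """Attribute attention weight matrix: one row per image attribute triple,
    one column per question attribute triple.
    Assumption: relevancy is a scaled dot product (not stated in the summary)."""
    scale = math.sqrt(len(image_feats[0]))
    return [[dot(img, q) / scale for q in question_feats] for img in image_feats]


def pool_image_triples(image_feats, matrix):
    """Assumption: the target image feature is a weighted sum of the image
    triple features, weighted by a softmax over each row's mean relevancy."""
    row_scores = [sum(row) / len(row) for row in matrix]
    weights = softmax(row_scores)
    dim = len(image_feats[0])
    return [sum(w * f[k] for w, f in zip(weights, image_feats)) for k in range(dim)]


def fuse(image_attr, image_rel, question_global):
    """Concatenate the three features into the classifier input vector."""
    return image_attr + image_rel + question_global


# Hypothetical toy features (2-dimensional triples, 3-dimensional global feature).
image_attr_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question_attr_feats = [[1.0, 0.0], [0.5, 0.5]]

matrix = relevancy_matrix(image_attr_feats, question_attr_feats)
target_image_attr = pool_image_triples(image_attr_feats, matrix)
target_image_rel = [0.2, 0.8]          # placeholder, would come from a parallel relationship branch
question_global = [0.1, 0.3, 0.5]      # placeholder question global feature

classifier_input = fuse(target_image_attr, target_image_rel, question_global)
print(len(matrix), len(matrix[0]), len(classifier_input))
```

In this sketch the matrix has one entry per pair of image and question attribute triples, matching the "each ... and each" wording of the summary; the pooling rule is only one of several plausible ways to derive a single target feature from it.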