Question image triple structured guided visual question answering method and device

Bibliographic Details
Main Authors	HAN ZEFANG, XIE XUEMEI, FANG MIAN, LIU YONG, LI JINHANG
Format	Patent
Language	Chinese; English
Published	07.07.2023
Subjects	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
Summary:	The invention relates to a question image triple structured guided visual question answering method and device. The method comprises the following steps: acquiring a target image and a target question about the target image; extracting a question global feature, a plurality of question attribute triple features and a plurality of question relationship triple features of the target question by using a first target model; extracting a plurality of image attribute triple features of the target image by using a second target model; determining, by using a target attention model, a first relevancy between each image attribute triple feature and each question attribute triple feature, the plurality of first relevancies forming an attribute attention weight matrix; and splicing (concatenating) the target image attribute triple feature, the target image relationship triple feature and the question global feature, and inputting the spliced features into a target answer classifier to obtain answer information of the target
Bibliography:	Application Number: CN202310261086
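
The attention step named in the Summary can be illustrated with a minimal sketch. The record does not say how the relevancy is computed or how the target image attribute triple feature is derived from the weight matrix, so everything below is an assumption made for illustration: relevancy is taken to be a scaled dot product normalised with a softmax, pooling is a weighted sum averaged over the question triples, and the function names (`attention_weight_matrix`, `pool_image_feature`) and toy feature vectors are hypothetical. The relationship triple branch and the answer classifier are omitted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weight_matrix(image_triples, question_triples):
    """Attribute attention weight matrix.

    Rows index question attribute triples, columns index image attribute
    triples. ASSUMPTION: relevancy is a scaled dot product, normalised
    per row; the patent record does not specify the relevancy function.
    """
    scale = math.sqrt(len(image_triples[0]))
    return [
        softmax([dot(q, v) / scale for v in image_triples])
        for q in question_triples
    ]


def pool_image_feature(weights, image_triples):
    """ASSUMPTION: average the weights over all question triples, then
    take the weighted sum of the image triple features."""
    n_q = len(weights)
    n_img = len(image_triples)
    col = [sum(row[j] for row in weights) / n_q for j in range(n_img)]
    dim = len(image_triples[0])
    return [
        sum(col[j] * image_triples[j][d] for j in range(n_img))
        for d in range(dim)
    ]


# Toy features (hypothetical values, 2-dimensional for readability).
image_attr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question_attr = [[2.0, 0.0], [0.0, 2.0]]
question_global = [0.5, -0.5]

W = attention_weight_matrix(image_attr, question_attr)
target_attr = pool_image_feature(W, image_attr)

# "Splicing" is modelled as plain concatenation of the feature vectors.
fused = target_attr + question_global
```

In this sketch `fused` stands in for the spliced feature that would be passed to the answer classifier; a full pipeline would also concatenate the image relationship triple feature mentioned in the Summary.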