Question image triple structured guided visual question answering method and device
Format | Patent |
---|---|
Language | Chinese, English |
Published | 07.07.2023 |
Summary: | The invention relates to a visual question answering method and device guided by structured question and image triples. The method comprises the following steps: acquiring a target image and a target question about the target image; extracting a question global feature, a plurality of question attribute triple features and a plurality of question relationship triple features of the target question using a first target model; extracting a plurality of image attribute triple features of the target image using a second target model; determining, with a target attention model, a first relevancy between each image attribute triple feature and each question attribute triple feature, the plurality of first relevancies forming an attribute attention weight matrix; and concatenating the target image attribute triple feature, the target image relationship triple feature and the question global feature, and inputting the concatenated features into a target answer classifier to obtain answer information of the target |
---|---|
Bibliography: | Application Number: CN202310261086 |
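
The attention and concatenation steps named in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the record does not say how relevancy is computed or how the target features are selected, so the scaled dot product, the softmax normalisation, the weighted pooling, and every function name below (`attribute_attention`, `attend`, `fuse`) are assumptions made for illustration. The relationship triple feature is passed in as a placeholder vector because the record does not describe how it is obtained.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_attention(image_feats, question_feats):
    """Build an attribute attention weight matrix W.

    W[i][j] is the relevancy between image attribute triple i and
    question attribute triple j. Assumption: scaled dot-product scores,
    softmax-normalised over the image triples for each question triple.
    """
    d = len(image_feats[0])
    n_img, n_q = len(image_feats), len(question_feats)
    scores = [[dot(img, q) / math.sqrt(d) for q in question_feats]
              for img in image_feats]
    # Normalise each column so the weights for one question triple sum to 1.
    cols = [softmax([scores[i][j] for i in range(n_img)]) for j in range(n_q)]
    return [[cols[j][i] for j in range(n_q)] for i in range(n_img)]


def attend(image_feats, weights):
    """Pool image features into one target feature (assumed weighted sum).

    Each image triple's weight is its mean relevancy across question triples.
    """
    n_q = len(weights[0])
    per_image = [sum(row) / n_q for row in weights]
    d = len(image_feats[0])
    return [sum(w * f[k] for w, f in zip(per_image, image_feats))
            for k in range(d)]


def fuse(target_attr, target_rel, question_global):
    """Concatenate the three features into the classifier input vector."""
    return target_attr + target_rel + question_global


# Toy inputs: three image attribute triples, two question attribute triples.
image_attr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question_attr = [[1.0, 0.0], [0.0, 2.0]]

W = attribute_attention(image_attr, question_attr)
target_attr = attend(image_attr, W)
# Placeholder relationship feature and question global feature (hypothetical).
fused = fuse(target_attr, [0.5, 0.5], [0.1, 0.2, 0.3])
```

In this sketch `fused` would be passed to an answer classifier, which is omitted here because the record gives no detail about it.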