Deep Modular Bilinear Attention Network for Visual Question Answering
VQA (Visual Question Answering) is a multimodal task: given an image and a question about that image, a model must determine the correct answer. The attention mechanism has become a de facto component of almost all VQA models. Most recent VQA approaches use dot-product to calculate the intra-modali...
Published in | Sensors (Basel, Switzerland) Vol. 22; no. 3; p. 1045 |
---|---|
Format | Journal Article |
Language | English |
Published | Switzerland: MDPI AG, 28.01.2022 |