Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA

Text-based visual question answering (TextVQA) faces the significant challenge of avoiding redundant relational inference. Specifically, the large number of detected objects and optical character recognition (OCR) tokens results in rich visual relationships. Existing works take all visual relationshi...

Bibliographic Details
Published in IEEE Transactions on Image Processing Vol. 32; pp. 5060 - 5074
Main Authors Zhou, Sheng; Guo, Dan; Li, Jia; Yang, Xun; Wang, Meng
Format Journal Article
Language English
Published New York IEEE 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)