Relation-aware Hierarchical Attention Framework for Video Question Answering

Video Question Answering (VideoQA) is a challenging video understanding task since it requires a deep understanding of both question and video. Previous studies mainly focus on extracting sophisticated visual and language embeddings, fusing them by delicate hand-crafted networks. However, the releva...

Bibliographic Details
Published in: arXiv.org
Main Authors: Li, Fangtao; Bai, Ting; Cao, Chenyu; Liu, Zihe; Yan, Chenghao; Wu, Bin
Format: Paper, Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 14.05.2021