Relation-aware Hierarchical Attention Framework for Video Question Answering

Video Question Answering (VideoQA) is a challenging video understanding task because it requires a deep understanding of both the question and the video. Previous studies mainly focus on extracting sophisticated visual and language embeddings and fusing them with delicate hand-crafted networks. However, the releva...

Bibliographic Details
Published in	arXiv.org
Main Authors	Li, Fangtao; Bai, Ting; Cao, Chenyu; Liu, Zihe; Yan, Chenghao; Wu, Bin
Format	Paper Journal Article
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 14.05.2021
Subjects	Coders; Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Feature extraction; Frames (data processing); Object recognition; Questions; Video