Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. In this work, we propose a combined bottom-up and top-down attention mechan...

Bibliographic Details
Published in	2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6077-6086
Main Authors	Anderson, Peter; He, Xiaodong; Buehler, Chris; Teney, Damien; Johnson, Mark; Gould, Stephen; Zhang, Lei
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2018
Subjects	Context modeling; Mathematical model; Object detection; Proposals; Servers; Task analysis; Visualization