Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. In this work, we propose a combined bottom-up and top-down attention mechan...

Bibliographic Details
Published in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6077-6086
Main Authors: Anderson, Peter; He, Xiaodong; Buehler, Chris; Teney, Damien; Johnson, Mark; Gould, Stephen; Zhang, Lei
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2018