A survey on deep neural network-based image captioning

Bibliographic Details
Published in The Visual Computer Vol. 35; no. 3; pp. 445-470
Main Authors Liu, Xiaoxiao; Xu, Qingyang; Wang, Ning
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.03.2019
Springer Nature B.V.
Summary: Image captioning is an active topic in image understanding. It comprises two natural parts (“look” and “language expression”), which correspond to the two most important fields of artificial intelligence (“machine vision” and “natural language processing”). With the development of deep neural networks and better-labeled datasets, image captioning techniques have advanced quickly. This survey introduces deep neural network-based image captioning approaches and their improvements, including the characteristics of the specific techniques. The earliest deep neural network-based approach is the retrieval-based method, which uses a search technique to find an appropriate description for an image. The template-based method separates the image captioning process into object detection and sentence generation. More recently, end-to-end learning-based methods have proven effective for image captioning and can generate more flexible and fluent sentences. The survey reviews these image captioning methods in detail and discusses some remaining challenges.
ISSN: 0178-2789
1432-2315
DOI: 10.1007/s00371-018-1566-y