Review of Image Captioning Methods Based on Encoding-Decoding Technology
Published in | Jisuanji kexue yu tansuo Vol. 16; no. 10; pp. 2234 - 2248 |
---|---|
Main Author | |
Format | Journal Article |
Language | Chinese |
Published | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 01.10.2022 |
Subjects | |
Summary: | In recent years, image caption generation, a multimodal task in artificial intelligence, has combined research in computer vision and natural language processing to convert images into text. It plays an important role in visual assistance and image understanding and has attracted extensive attention from researchers. This paper first describes the image caption generation task and introduces three classes of methods: template-based, retrieval-based, and encoding-decoding. For each class, it covers the core idea, representative work, and advantages and disadvantages. It then examines the encoding-decoding methods in detail in terms of model structure and research progress in the image understanding and caption generation phases, and groups the research of recent years into those two areas. Image understanding research includes attention mechani |
---|---|
ISSN: | 1673-9418 |
DOI: | 10.3778/j.issn.1673-9418.2112080 |