An image caption method based on object detection
Published in | Multimedia Tools and Applications, Vol. 78, no. 24, pp. 35329-35350 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | New York; Springer US; 01.12.2019; Springer Nature B.V |
Summary: | Representing image information effectively is the key to the image captioning task. Existing research has proposed a large number of image captioning methods. Most of them use the global information of the image, so information that is unrelated to caption generation also participates in the calculation, which wastes computational resources. To solve this problem, this paper proposes a method for generating image captions based on object detection. First, an object detection algorithm extracts image features, so that only the features of meaningful regions in the image are used; the caption is then generated by combining a spatial attention mechanism with the caption generation network. Experiments show that the image features of the object regions and the salient regions are sufficient to represent the information of the entire image in the image captioning task. For better convergence of the model, this paper also uses a new training strategy. The experimental results show that the proposed model works well on the image captioning test dataset, and the authors regard it as setting a precedent for new techniques in this area. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-019-08116-9 |
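
The summary describes combining features of detected object regions with a spatial attention mechanism. This record does not give the paper's equations, so the following is only a rough sketch of generic additive soft attention over region features, not the authors' model. The function name `spatial_attention`, the weights `W_v`, `W_h`, `w_a`, and all dimensions are assumptions made for illustration, and random vectors stand in for real detector outputs.

```python
import numpy as np


def spatial_attention(region_feats, hidden, W_v, W_h, w_a):
    """Generic additive soft attention over K region feature vectors.

    region_feats: (K, D) one feature vector per detected region (assumed input)
    hidden:       (H,)   current state of the caption decoder (assumed input)
    W_v: (D, A), W_h: (H, A), w_a: (A,) hypothetical learned parameters
    Returns the attended context vector (D,) and the attention weights (K,).
    """
    # Score each region against the decoder state.
    scores = np.tanh(region_feats @ W_v + hidden @ W_h) @ w_a  # shape (K,)
    # Numerically stable softmax over the K regions.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of region features: the context passed to the decoder.
    context = weights @ region_feats  # shape (D,)
    return context, weights


# Toy example: 5 regions, 8-dim features, 6-dim decoder state, 4-dim attention space.
rng = np.random.default_rng(0)
K, D, H, A = 5, 8, 6, 4
region_feats = rng.normal(size=(K, D))  # placeholder for object detector features
hidden = rng.normal(size=H)
W_v = rng.normal(size=(D, A))
W_h = rng.normal(size=(H, A))
w_a = rng.normal(size=A)

context, weights = spatial_attention(region_feats, hidden, W_v, W_h, w_a)
```

In a sketch like this, only the K region vectors enter the computation, which mirrors the summary's point that features unrelated to the caption need not take part in the calculation.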