A Survey On Image Captioning
Published in | 2022 International Conference on Emerging Trends in Smart Technologies (ICETST) pp. 1 - 6 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 23.09.2022 |
Subjects | |
Summary: | Image captioning is a challenging task that has attracted a lot of ongoing research in the field of computer vision. It combines natural language processing with computer vision to produce captions. The majority of the papers reviewed for this survey use the encoder-decoder framework, but there are many other techniques, covering both supervised and unsupervised image captioning. One technique, scene graph alignment, is used for unsupervised captioning. Some authors reworked older procedures rather than adopting newer ones in order to produce better results. The majority of publications used a CNN as the encoder and an RNN or LSTM as the decoder. Authors also employed many centred graphs to reconstruct several sentences from them. Another strategy is the actor-critic model, in which the actor performs a task and the critic provides feedback to improve the results. |
---|---|
DOI: | 10.1109/ICETST55735.2022.9922935 |