A Survey On Image Captioning

Bibliographic Details
Published in: 2022 International Conference on Emerging Trends in Smart Technologies (ICETST), pp. 1-6
Main Authors: Osaid, Muhammad; Memon, Zulfiqar Ali
Format: Conference Proceeding
Language: English
Published: IEEE, 23.09.2022
Summary: Image captioning is a challenging task that has attracted a great deal of ongoing research in computer vision. It combines natural language processing with computer vision to generate captions. The majority of the papers reviewed in this survey use the encoder-decoder framework, but there are many other techniques, including supervised and unsupervised image captioning. One technique, scene graph alignment, is used for unsupervised captioning. Some authors reworked older procedures rather than adopting newer ones in order to produce better results. The majority of publications used a CNN as the encoder and an RNN or LSTM as the decoder. Centred graphs were also employed by some authors to reconstruct multiple sentences. Another strategy is the actor-critic model, in which the actor performs a task and the critic provides feedback to improve the results.
DOI: 10.1109/ICETST55735.2022.9922935