A Survey On Image Captioning

Bibliographic Details
Published in	2022 International Conference on Emerging Trends in Smart Technologies (ICETST), pp. 1-6
Main Authors	Osaid, Muhammad; Memon, Zulfiqar Ali
Format	Conference Proceeding
Language	English
Published	IEEE 23.09.2022
Subjects	CNN (convolutional neural network); decoder; encoder; image captioning; LSTM (long short-term memory); RNN (recurrent neural network); supervised; unsupervised

More Information
Summary:	Image captioning is a challenging task that has attracted a great deal of ongoing research in computer vision. It combines natural language processing with computer vision to generate captions. The majority of the papers reviewed for this survey use the encoder-decoder framework, but there are many other techniques, including supervised and unsupervised image captioning. One technique, scene graph alignment, is used for unsupervised captioning. Some authors reworked older methods rather than adopting newer ones in order to produce better results. The majority of publications used a CNN as the encoder and an RNN or LSTM as the decoder. Several authors also employed centred graphs to reconstruct multiple sentences. Another strategy is the actor-critic model, in which the actor performs a task and the critic provides feedback to improve the results.
DOI:	10.1109/ICETST55735.2022.9922935