Machine Learning based approach to Image Description for the Visually Impaired

Bibliographic Details
Published in	2021 Asian Conference on Innovation in Technology (ASIANCON) pp. 1 - 6
Main Authors	Vrindavanam, Javavrinda; Srinath, Raghunandan; Fathima, Anisa; Arpitha, S.; Rao, Chaitanya S; Kavya, T.
Format	Conference Proceeding
Language	English
Published	IEEE 27.08.2021
Subjects	Bahdanau attention model; CNN; Computational modeling; Computer vision; Decoding; Feature extraction; GRU-RNN; Image Description; Internet; Machine learning; Resnet - V2; Technological innovation; text description; text-to-speech converter
Summary:	The paper, supported by a review of the literature on image description technologies, introduces an alternative approach that automatically generates audio descriptions of images to support the visually impaired. The need for the paper arises from the fact that interaction points for the visually impaired are becoming constrained in an increasingly digitised environment, and access to digital media through an image describer can be an enabler. Images that the visually impaired cannot see are processed, suitable descriptions are generated, and the descriptions are converted to voice output. In contrast to standard computer vision and Convolutional Neural Network (CNN) methods, the paper uses the Inception Resnet - V2 model as the feature extractor and a GRU-RNN decoder with the Bahdanau attention model to generate a text description of the image, which is finally converted to audio using the Google Text-to-Speech converter. The results were found to be more accurate, and the approach can accordingly support access to digital media for the visually impaired.
DOI:	10.1109/ASIANCON51346.2021.9544867
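
As an illustration of the attention step named in the summary, the following is a minimal pure-Python sketch of additive (Bahdanau-style) attention over image-region features. It is not the authors' implementation: the function names, the toy dimensions, and the random weights are assumptions made for illustration. In the pipeline the summary describes, the region features would come from Inception Resnet - V2 and the hidden state from the GRU decoder.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(features, hidden, w_feat, w_hid, v):
    """Additive (Bahdanau-style) attention.

    features: one feature vector per image region (encoder output)
    hidden:   current decoder hidden state
    Returns (context_vector, attention_weights).
    """
    projected_hidden = matvec(w_hid, hidden)
    scores = []
    for feat in features:
        projected_feat = matvec(w_feat, feat)
        # score = v . tanh(W_feat * feature + W_hid * hidden)
        energy = [math.tanh(a + b) for a, b in zip(projected_feat, projected_hidden)]
        scores.append(sum(vi * ei for vi, ei in zip(v, energy)))
    weights = softmax(scores)
    dim = len(features[0])
    # The context vector is the attention-weighted sum of the region features.
    context = [sum(w * feat[i] for w, feat in zip(weights, features)) for i in range(dim)]
    return context, weights


def random_matrix(rows, cols, rng):
    """Hypothetical helper: random weights standing in for learned ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen for illustration only; a trained model would use
# learned weights and much larger dimensions.
rng = random.Random(0)
NUM_REGIONS, FEAT_DIM, HID_DIM, ATT_DIM = 4, 6, 5, 3
features = random_matrix(NUM_REGIONS, FEAT_DIM, rng)
hidden = [rng.uniform(-0.5, 0.5) for _ in range(HID_DIM)]
w_feat = random_matrix(ATT_DIM, FEAT_DIM, rng)
w_hid = random_matrix(ATT_DIM, HID_DIM, rng)
v = [rng.uniform(-0.5, 0.5) for _ in range(ATT_DIM)]

context, weights = additive_attention(features, hidden, w_feat, w_hid, v)
```

At each decoding step the weights form a probability distribution over image regions, and the context vector is what the decoder would consume alongside the previously generated word.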