NarrationBot and InfoBot: A Hybrid System for Automated Video Description

Bibliographic Details
Main Authors	Ihorn, Shasta; Siu, Yue-Ting; Bodi, Aditya; Narins, Lothar; Castanon, Jose M; Kant, Yash; Das, Abhishek; Yoon, Ilmi; Fazli, Pooyan
Format	Journal Article
Language	English
Published	07.11.2021
Subjects	Computer Science - Computer Vision and Pattern Recognition; Computer Science - Human-Computer Interaction; Computer Science - Learning

More Information
Summary:	Video accessibility is crucial for blind and low vision users for equitable engagements in education, employment, and entertainment. Despite the availability of professional and amateur services and tools, most human-generated descriptions are expensive and time consuming. Moreover, the rate of human-generated descriptions cannot match the speed of video production. To overcome the increasing gaps in video accessibility, we developed a hybrid system of two tools to 1) automatically generate descriptions for videos and 2) provide answers or additional descriptions in response to user queries on a video. Results from a mixed-methods study with 26 blind and low vision individuals show that our system significantly improved user comprehension and enjoyment of selected videos when both tools were used in tandem. In addition, participants reported no significant difference in their ability to understand videos when presented with autogenerated descriptions versus human-revised autogenerated descriptions. Our results demonstrate user enthusiasm about the developed system and its promise for providing customized access to videos. We discuss the limitations of the current work and provide recommendations for the future development of automated video description tools.
DOI:	10.48550/arxiv.2111.03994