An audiovisual cognitive optimization strategy guided by salient object ranking for intelligent visual prothesis systems

Bibliographic Details
Published in Journal of neural engineering Vol. 21; no. 6; pp. 66021 - 66041
Main Authors Liang, Junling; Li, Heng; Chai, Xinyu; Gao, Qi; Zhou, Meixuan; Guo, Tianruo; Chen, Yao; Di, Liqing
Format Journal Article
Language English
Published England IOP Publishing 01.12.2024
Summary: Objective. Visual prostheses are effective tools for restoring vision, yet real-world complexities pose ongoing challenges. Progress in AI has given rise to the concept of intelligent visual prosthetics with auditory support, leveraging deep learning to create practical artificial vision perception beyond merely restoring natural sight for the blind. Approach. This study introduces an object-based attention mechanism that relates the gaze points humans produce when observing the external world to descriptions of physical regions. By transforming this mechanism into a ranking problem of salient entity regions, we introduce prior visual attention cues to build a new salient object ranking (SaOR) dataset, and propose a SaOR network aimed at providing depth perception for prosthetic vision. Furthermore, we propose a SaOR-guided image description method that aligns with human observation patterns, with the aim of providing additional visual information through auditory feedback. Finally, the integration of these two algorithms constitutes an audiovisual cognitive optimization strategy for prosthetic vision. Main results. Through psychophysical experiments based on scene description tasks under simulated prosthetic vision, we verify that the SaOR method improves the subjects’ performance in object identification and in understanding the correlations among objects. Additionally, the cognitive optimization strategy incorporating image description further enhances their prosthetic visual cognition. Significance. This work offers valuable technical insights for designing next-generation intelligent visual prostheses and establishes a theoretical groundwork for developing their visual information processing strategies. Code will be made publicly available.
Bibliography: JNE-107641.R1
ISSN: 1741-2560
1741-2552
DOI: 10.1088/1741-2552/ad94a4