An audiovisual cognitive optimization strategy guided by salient object ranking for intelligent visual prothesis systems

Bibliographic Details
Published in	Journal of neural engineering Vol. 21; no. 6; pp. 66021 - 66041
Main Authors	Liang, Junling; Li, Heng; Chai, Xinyu; Gao, Qi; Zhou, Meixuan; Guo, Tianruo; Chen, Yao; Di, Liqing
Format	Journal Article
Language	English
Published	England: IOP Publishing, 01.12.2024
Subjects	Adult; Algorithms; Artificial Intelligence; Attention - physiology; audiovisual cognition for prosthetic vision; Auditory Perception - physiology; Cognition - physiology; Deep Learning; Depth Perception - physiology; Female; Humans; image semantic description; intelligent visual prosthesis; Male; Photic Stimulation - methods; prior knowledge; Prosthesis Design - methods; salient object ranking; Visual Perception - physiology; Visual Prosthesis
Summary:	Objective. Visual prostheses are effective tools for restoring vision, yet real-world complexities pose ongoing challenges. The progress in AI has led to the emergence of the concept of intelligent visual prosthetics with auditory support, leveraging deep learning to create practical artificial vision perception beyond merely restoring natural sight for the blind. Approach. This study introduces an object-based attention mechanism that simulates human gaze points when observing the external world to descriptions of physical regions. By transforming this mechanism into a ranking problem of salient entity regions, we introduce prior visual attention cues to build a new salient object ranking (SaOR) dataset, and propose a SaOR network aimed at providing depth perception for prosthetic vision. Furthermore, we propose a SaOR-guided image description method to align with human observation patterns, toward providing additional visual information by auditory feedback. Finally, the integration of the two aforementioned algorithms constitutes an audiovisual cognitive optimization strategy for prosthetic vision. Main results. Through conducting psychophysical experiments based on scene description tasks under simulated prosthetic vision, we verify that the SaOR method improves the subjects’ performance in terms of object identification and understanding the correlation among objects. Additionally, the cognitive optimization strategy incorporating image description further enhances their prosthetic visual cognition. Significance. This offers valuable technical insights for designing next-generation intelligent visual prostheses and establishes a theoretical groundwork for developing their visual information processing strategies. Code will be made publicly available.
Bibliography:	JNE-107641.R1
ISSN:	1741-2560, 1741-2552
DOI:	10.1088/1741-2552/ad94a4