NEXUS: Neural cross-modal expression with subject-unified synthesis for brain–vision–language decoding

Bibliographic Details
Published in: Biomedical Signal Processing and Control, Vol. 112, p. 108400
Main Authors: Jin, Xiao; Wang, Yongxiong; Huang, Shuai; Du, Yukun; Zhang, Nan
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.02.2026

Summary: Visual information decoding from EEG signals presents significant challenges in brain–computer interface research, particularly when addressing cross-subject variability and multi-modal expression. Current approaches often struggle with subject-specific neural patterns and typically focus on single-task objectives, which limits their practical applicability. We propose NEXUS, a comprehensive framework that integrates subject-specific adaptation with multi-task learning for brain–vision–language decoding. NEXUS introduces a novel subject adaptation layer that processes EEG signals before they branch into specialized spatial and temporal pathways, capturing individual neural characteristics while maintaining architectural efficiency. Beyond traditional classification and retrieval, the framework extends to text caption generation and image reconstruction, enabling richer interpretations of neural activity. The dual-task objective, which combines contrastive learning with matching prediction, is further strengthened by cross-modal generation objectives, creating a synergistic learning setup in which each task reinforces the others. Experimental results on the Things-EEG2 dataset show that NEXUS significantly outperforms existing methods on zero-shot classification and retrieval, while also producing coherent text descriptions and visually meaningful image reconstructions from EEG signals. Most notably, the approach yields substantial gains in cross-subject scenarios, narrowing the performance gap between subject-dependent and subject-independent conditions. These advances mark an important step toward practical brain–computer interfaces that can decode and express neural activity across individuals and modalities.
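
To make the architecture described in the summary concrete, below is a minimal PyTorch sketch of a subject adaptation layer feeding parallel spatial and temporal pathways, together with the dual contrastive-plus-matching objective. Everything in it (module names such as SubjectAdaptationLayer and EEGEncoder, the affine form of the adaptation, channel counts, kernel sizes, and the match_head classifier) is an illustrative assumption, not the authors' implementation; the abstract specifies these components only at a high level.

```python
# Minimal sketch, assuming a PyTorch implementation. All names, shapes,
# kernel sizes, and hyperparameters are illustrative assumptions; the
# abstract describes these components only at a high level.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SubjectAdaptationLayer(nn.Module):
    """Learned per-subject affine transform (scale/shift per electrode),
    applied to raw EEG before the shared spatial/temporal pathways."""

    def __init__(self, n_subjects: int, n_channels: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(n_subjects, n_channels, 1))
        self.shift = nn.Parameter(torch.zeros(n_subjects, n_channels, 1))

    def forward(self, x: torch.Tensor, subject_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time); subject_ids: (batch,) long tensor
        return x * self.scale[subject_ids] + self.shift[subject_ids]


class EEGEncoder(nn.Module):
    """Subject adaptation followed by parallel spatial and temporal
    branches, fused into a single normalized embedding."""

    def __init__(self, n_subjects=10, n_channels=63, dim=256):
        super().__init__()
        self.adapt = SubjectAdaptationLayer(n_subjects, n_channels)
        # Spatial branch: pointwise conv mixes electrodes at each time step.
        self.spatial = nn.Sequential(
            nn.Conv1d(n_channels, dim, kernel_size=1), nn.GELU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Temporal branch: wide kernels capture local temporal dynamics.
        self.temporal = nn.Sequential(
            nn.Conv1d(n_channels, dim, kernel_size=25, padding=12), nn.GELU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, x, subject_ids):
        x = self.adapt(x, subject_ids)
        s = self.spatial(x).squeeze(-1)   # (batch, dim)
        t = self.temporal(x).squeeze(-1)  # (batch, dim)
        return F.normalize(self.proj(torch.cat([s, t], dim=-1)), dim=-1)


def dual_task_loss(eeg_emb, img_emb, match_head, temperature=0.07):
    """Symmetric InfoNCE contrastive loss over EEG-image pairs, plus a
    binary matching-prediction loss on aligned vs. shuffled pairs."""
    logits = eeg_emb @ img_emb.t() / temperature
    targets = torch.arange(len(eeg_emb), device=eeg_emb.device)
    contrastive = 0.5 * (F.cross_entropy(logits, targets)
                         + F.cross_entropy(logits.t(), targets))
    # Matching head scores elementwise-fused pairs; a random shuffle serves
    # as negatives (a shuffled index may rarely coincide with its positive).
    neg = img_emb[torch.randperm(len(img_emb), device=img_emb.device)]
    pairs = torch.cat([eeg_emb * img_emb, eeg_emb * neg])
    labels = torch.cat([torch.ones(len(eeg_emb)),
                        torch.zeros(len(eeg_emb))]).to(eeg_emb.device)
    matching = F.binary_cross_entropy_with_logits(
        match_head(pairs).squeeze(-1), labels)
    return contrastive + matching
```

In this sketch a matching head as simple as nn.Linear(dim, 1) suffices, and only the per-subject scale/shift parameters are subject-specific, which is one plausible reading of the abstract's claim of capturing individual characteristics "while maintaining architectural efficiency". The caption-generation and image-reconstruction objectives would sit on top of the same shared embedding but are omitted here.
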
ISSN: 1746-8094
DOI: 10.1016/j.bspc.2025.108400