Cross-modal guiding and reweighting network for multi-modal RSVP-based target detection

Bibliographic Details
Published in	Neural Networks, Vol. 161, pp. 65–82
Main Authors	Mao, Jiayu; Qiu, Shuang; Wei, Wei; He, Huiguang
Format	Journal Article
Language	English
Published	United States: Elsevier Ltd, 01.04.2023
Subjects	Brain-Computer Interfaces; Brain–computer interface (BCI); Convolutional neural network (CNN); Electroencephalogram (EEG); Electroencephalography - methods; Evoked Potentials; Humans; Multi-modal learning; Rapid serial visual presentation (RSVP)
Summary:	A Brain–Computer Interface (BCI) based on Rapid Serial Visual Presentation (RSVP) facilitates the high-throughput detection of rare target images by detecting evoked event-related potentials (ERPs). At present, the decoding accuracy of RSVP-based BCI systems limits their practical application. This study introduces eye movements (gaze and pupil information), referred to as the EYE modality, as an additional source of information to combine with the EEG-based BCI, forming a novel system for detecting target images in RSVP tasks. We performed an RSVP experiment, recorded EEG signals and eye movements simultaneously during a target detection task, and constructed a multi-modal dataset comprising 20 subjects. We also proposed a cross-modal guiding and fusion network that fully utilizes the EEG and EYE modalities and fuses them for better RSVP decoding performance. In this network, a two-branch backbone extracts features from the two modalities. A Cross-Modal Feature Guiding (CMFG) module guides EYE modality features to complement the EEG modality for better feature extraction. A Multi-scale Multi-modal Reweighting (MMR) module enhances the multi-modal features by exploring intra- and inter-modal interactions. A Dual Activation Fusion (DAF) module modulates the enhanced multi-modal features for effective fusion. The proposed network achieved a balanced accuracy of 88.00% (±2.29) on the collected dataset. Ablation studies and visualizations demonstrated the effectiveness of the proposed modules. This work indicates that introducing the EYE modality is effective in RSVP tasks, and that the proposed network is a promising method for RSVP decoding that further improves the performance of RSVP-based target detection systems.
• We design and conduct RSVP experiments to collect EEG and eye movements data.
• A cross-modal guiding and reweighting network utilizes multi-modal information.
• The proposed network outperforms existing comparable methods in RSVP tasks.
• Visualizations and ablation studies verify the effectiveness of the proposed modules.
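The summary reports balanced accuracy rather than plain accuracy, which is a natural choice when targets are rare, as they are in RSVP tasks. The record does not define the metric, so the sketch below assumes the standard definition (the mean of per-class recall). The function name and the example labels are hypothetical and are not taken from the paper or its dataset.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for binary labels this is (TPR + TNR) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        # Indices of all samples whose true label is class c
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels: 2 target images (1) among 10 presented images
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.6875
```

In this made-up example, plain accuracy looks high because non-targets dominate, while balanced accuracy weights the missed target equally with the non-target class.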
ISSN:	0893-6080, 1879-2782
DOI:	10.1016/j.neunet.2023.01.009