ERTNet: an interpretable transformer-based framework for EEG emotion recognition

Bibliographic Details
Published in: Frontiers in Neuroscience, Vol. 18, p. 1320645
Main Authors: Liu, Ruixiang; Chao, Yihu; Ma, Xuerui; Sha, Xianzheng; Sun, Limin; Li, Shuo; Chang, Shijie
Format: Journal Article
Language: English
Published: Switzerland: Frontiers Media S.A., 2024

Summary: Emotion recognition using EEG signals enables clinicians to assess patients' emotional states with precision and immediacy. However, the complexity of EEG signal data poses challenges for traditional recognition methods. Deep learning techniques effectively capture the nuanced emotional cues within these signals by leveraging extensive data. Nonetheless, most deep learning techniques lack interpretability while maintaining accuracy. We developed an interpretable end-to-end EEG emotion recognition framework rooted in a hybrid CNN and transformer architecture. Specifically, temporal convolution isolates salient information from EEG signals while filtering out potential high-frequency noise. Spatial convolution discerns the topological connections between channels. Subsequently, the transformer module processes the feature maps to integrate high-level spatiotemporal features, enabling identification of the prevailing emotional state. Experimental results demonstrate that our model excels in diverse emotion classification, achieving an accuracy of 74.23% ± 2.59% on the dimensional model (DEAP) and 67.17% ± 1.70% on the discrete model (SEED-V). These results surpass the performance of both CNN- and LSTM-based counterparts. Through interpretive analysis, we ascertained that the beta and gamma bands of the EEG signals exert the most significant impact on emotion recognition performance. Notably, our model can independently tailor a Gaussian-like convolution kernel, effectively filtering high-frequency noise from the input EEG data. Given its robust performance and interpretative capabilities, our proposed framework is a promising tool for EEG-driven emotion brain-computer interfaces.
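The pipeline described in the abstract (temporal convolution, spatial convolution across channels, then transformer-style attention and a classification head) can be sketched as follows. This is a minimal NumPy illustration of the data flow only, not the authors' implementation; all dimensions, the single-head attention without projections, and the Gaussian kernel width are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 32 EEG channels and
# 128 time samples for one recording segment.
C, T = 32, 128
x = rng.standard_normal((C, T))          # one EEG segment (channels x time)

# 1) Temporal convolution: a 1-D kernel slid along time for every channel.
#    The paper reports the learned kernel takes a Gaussian-like low-pass
#    shape; here we hard-code such a shape for illustration.
k = 15
kernel = np.exp(-0.5 * ((np.arange(k) - k // 2) / 2.0) ** 2)
kernel /= kernel.sum()                   # normalized Gaussian-like kernel
temporal = np.stack([np.convolve(ch, kernel, mode="same") for ch in x])

# 2) Spatial convolution: mix information across channels with a (C x C)
#    weight, standing in for learned topological relations between electrodes.
W_spatial = rng.standard_normal((C, C)) / np.sqrt(C)
spatial = W_spatial @ temporal           # still (C, T)

# 3) Transformer-style self-attention over time steps to integrate
#    high-level spatiotemporal features (single head, no Q/K/V projections,
#    for brevity).
tokens = spatial.T                       # (T, C): one token per time step
scores = tokens @ tokens.T / np.sqrt(C)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)  # softmax over keys
features = attn @ tokens                 # (T, C) attended features

# 4) Classification head: mean-pool over time, then a linear map onto the
#    emotion classes (e.g., 5 for SEED-V).
n_classes = 5
W_out = rng.standard_normal((C, n_classes)) / np.sqrt(C)
logits = features.mean(axis=0) @ W_out
pred = int(np.argmax(logits))
```

In a trained model the convolution kernels, spatial weights, and attention projections would all be learned end-to-end; the interpretability claim rests on inspecting the learned temporal kernels (Gaussian-like low-pass) and the per-band contributions (beta, gamma) after training.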
Reviewed by: Man Fai Leung, Anglia Ruskin University, United Kingdom
Jiahui Pan, South China Normal University, China
Edited by: Zhao Lv, Anhui University, China
These authors have contributed equally to this work and share first authorship
ISSN: 1662-4548
EISSN: 1662-453X
DOI: 10.3389/fnins.2024.1320645