Multi-stream feature fusion of vision transformer and CNN for precise epileptic seizure detection from EEG signals
Published in | Journal of translational medicine Vol. 23; no. 1; pp. 871 - 23 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | England; BioMed Central Ltd; 06.08.2025; BioMed Central BMC |
Subjects | |
Summary: | Automated seizure detection based on scalp electroencephalography (EEG) can significantly accelerate epilepsy diagnosis. However, most existing deep learning-based detection methods struggle to capture both the local features and the global temporal dependencies of EEG signals, which limits their seizure detection performance.
Our study proposes an epilepsy detection model, CMFViT, based on a Multi-Stream Feature Fusion (MSFF) strategy that fuses a Convolutional Neural Network (CNN) with a Vision Transformer (ViT). The model converts EEG signals into time-frequency images using the Tunable Q-factor Wavelet Transform (TQWT), then uses the CNN module to capture local features and the ViT module to capture global time-series correlations. The MSFF strategy fuses these feature representations to improve the model's discriminative ability, and an average pooling layer followed by a fully connected layer performs the final classification.
The model was validated experimentally on the publicly available CHB-MIT dataset and the Kaggle 121 people epilepsy dataset. It achieved 98.85% classification accuracy, along with strong results on the other reported metrics, in single-subject experiments on the CHB-MIT dataset, and it also performed well in cross-subject experiments on the Kaggle dataset. Ablation experiments showed that the CNN and ViT modules play complementary roles and that integrating them significantly improves detection accuracy and generalization. Comparisons with other methods highlight the advantages of the CMFViT model.
The CMFViT model provides an efficient and accurate solution for complex EEG signal analysis and seizure detection in both single-subject and cross-subject settings, and it lays the foundation for developing real-time, accurate seizure detection systems. |
---|---|
ISSN: | 1479-5876 |
DOI: | 10.1186/s12967-025-06862-z |
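
The summary describes a two-stream pipeline: a time-frequency image feeds a convolutional stream (local features) and an attention stream (global dependencies), whose outputs are fused, pooled, and classified. The following is a minimal NumPy sketch of that general idea, not the authors' CMFViT implementation. Several pieces are assumptions made for illustration: a windowed FFT stands in for TQWT, fusion is plain concatenation (the record does not specify how MSFF combines features), each stream is a single layer, and all names (`tf_image`, `cnn_stream`, `vit_stream`, `classify`) and sizes are hypothetical. Weights are random and untrained, so the output probabilities carry no diagnostic meaning.

```python
import numpy as np

def tf_image(signal, win=32, hop=16):
    # Stand-in for TQWT (assumption): windowed FFT magnitudes, shape (freq, time).
    starts = range(0, len(signal) - win + 1, hop)
    frames = np.stack([signal[s:s + win] * np.hanning(win) for s in starts])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T
    return spec[:win // 2]  # drop the Nyquist bin so the height is win / 2

def cnn_stream(img, kernels):
    # Local features: valid 3x3 cross-correlation + ReLU, then global average pooling.
    windows = np.lib.stride_tricks.sliding_window_view(img, (3, 3))
    maps = np.einsum("ijab,kab->kij", windows, kernels)
    return np.maximum(maps, 0.0).mean(axis=(1, 2))  # shape (n_kernels,)

def vit_stream(img, patch, wq, wk, wv):
    # Global dependencies: one self-attention layer over flattened image patches.
    h, w = img.shape
    tokens = (img.reshape(h // patch, patch, w // patch, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, patch * patch))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return (attn @ v).mean(axis=0)  # average over tokens, shape (d,)

def classify(signal, params):
    img = tf_image(signal)
    img = img / (img.max() + 1e-12)  # scale to [0, 1]
    # Fusion by concatenation (assumption; MSFF details are not given in the record).
    fused = np.concatenate([
        cnn_stream(img, params["kernels"]),
        vit_stream(img, 8, params["wq"], params["wk"], params["wv"]),
    ])
    logits = fused @ params["w_fc"] + params["b_fc"]  # fully connected layer
    e = np.exp(logits - logits.max())
    return e / e.sum()  # softmax over (non-seizure, seizure)

rng = np.random.default_rng(0)
params = {
    "kernels": rng.standard_normal((4, 3, 3)),   # 4 conv filters
    "wq": rng.standard_normal((64, 16)) * 0.1,   # 8x8 patch -> 16-dim query
    "wk": rng.standard_normal((64, 16)) * 0.1,
    "wv": rng.standard_normal((64, 16)) * 0.1,
    "w_fc": rng.standard_normal((20, 2)) * 0.1,  # 4 CNN + 16 attention features
    "b_fc": np.zeros(2),
}
signal = rng.standard_normal(528)  # synthetic one-channel EEG segment
probs = classify(signal, params)
```

With these sizes, the 528-sample segment becomes a 16 x 32 image, the attention stream sees eight 8 x 8 patches, and the fused vector has 20 entries. A trained model would learn the kernels and projection matrices from labeled EEG segments rather than drawing them at random.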