EEG-Based Music Emotion Prediction Using Supervised Feature Extraction for MIDI Generation

Bibliographic Details
Published in	Sensors (Basel, Switzerland) Vol. 25; no. 5; p. 1471
Main Authors	Gomez-Morales, Oscar; Perez-Nastar, Hernan; Álvarez-Meza, Andrés Marino; Torres-Cardona, Héctor; Castellanos-Dominguez, Germán
Format	Journal Article
Language	English
Published	Switzerland MDPI AG 01.03.2025
Subjects	Algorithms; Analysis; Brain - physiology; Brain research; Cognition & reasoning; Deep Learning; EEG; Electroencephalography; Electroencephalography - methods; Emotions; Emotions - physiology; Fourier transforms; Humans; kernel methods; Medical imaging; Music; Music - psychology; music emotion recognition; Neural networks; Neuroimaging; Neurophysiology; Piano; piano-roll algorithm; Semantics

More Information
Summary:	Advancements in music emotion prediction are driving AI-driven algorithmic composition, enabling the generation of complex melodies. However, bridging neural and auditory domains remains challenging due to the semantic gap between brain-derived low-level features and high-level musical concepts, making alignment computationally demanding. This study proposes a deep learning framework for generating MIDI sequences aligned with labeled emotion predictions through supervised feature extraction from neural and auditory domains. EEGNet is employed to process neural data, while an autoencoder-based piano algorithm handles auditory data. To address modality heterogeneity, Centered Kernel Alignment is incorporated to enhance the separation of emotional states. Furthermore, regression between feature domains is applied to reduce intra-subject variability in extracted Electroencephalography (EEG) patterns, followed by the clustering of latent auditory representations into denser partitions to improve MIDI reconstruction quality. Using musical metrics, evaluation on real-world data shows that the proposed approach improves emotion classification (namely, between arousal and valence) and the system’s ability to produce MIDI sequences that better preserve temporal alignment, tonal consistency, and structural integrity. Subject-specific analysis reveals that subjects with stronger imagery paradigms produced higher-quality MIDI outputs, as their neural patterns aligned more closely with the training data. In contrast, subjects with weaker performance exhibited auditory data that were less consistent.
ISSN:	1424-8220
DOI:	10.3390/s25051471
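
Note on Centered Kernel Alignment: the summary names Centered Kernel Alignment (CKA) as the means of addressing modality heterogeneity and enhancing the separation of emotional states. As a rough illustration only, and not the authors' implementation, the sketch below computes CKA between a feature kernel and a label kernel with NumPy. The function names (`center_gram`, `rbf_gram`, `cka`), the RBF kernel choice, the bandwidth `sigma`, and the toy data are all assumptions made for this example.

```python
import numpy as np


def center_gram(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def rbf_gram(X, sigma=1.0):
    """RBF (Gaussian) Gram matrix for the rows of X; sigma is an assumed bandwidth."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def cka(K, L):
    """Centered Kernel Alignment between two Gram matrices of equal size."""
    Kc, Lc = center_gram(K), center_gram(L)
    num = np.sum(Kc * Lc)  # Frobenius inner product <Kc, Lc>
    den = np.sqrt(np.sum(Kc * Kc) * np.sum(Lc * Lc))
    return num / den


# Toy data (hypothetical): 40 samples, 2 emotion classes, 8-dimensional features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
label_gram = (y[:, None] == y[None, :]).astype(float)  # 1 where labels match

separated = rng.normal(size=(40, 8)) + 5.0 * y[:, None]  # class-dependent shift
unstructured = rng.normal(size=(40, 8))                  # no class information

print("CKA, separated features:   ", cka(rbf_gram(separated, sigma=4.0), label_gram))
print("CKA, unstructured features:", cka(rbf_gram(unstructured, sigma=4.0), label_gram))
```

In this toy setting a higher CKA value indicates that the feature kernel groups samples in the same way as the emotion labels, which is the sense in which an alignment score can be used as a supervised signal for feature extraction.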