Speech Synthesis from Stereotactic EEG using an Electrode Shaft Dependent Multi-Input Convolutional Neural Network Approach

Bibliographic Details
Published in: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Vol. 2021, pp. 6045-6048
Main Authors: Angrick, Miguel; Ottenhoff, Maarten; Goulis, Sophocles; Colon, Albert J.; Wagner, Louis; Krusienski, Dean J.; Kubben, Pieter L.; Schultz, Tanja; Herff, Christian
Format: Conference Proceeding; Journal Article
Language: English
Published: United States, IEEE, 01.11.2021
Summary: Neurological disorders can lead to significant impairments in speech communication and, in severe cases, cause the complete loss of the ability to speak. Brain-Computer Interfaces (BCIs) have shown promise as an alternative communication modality by directly transforming neural activity of speech processes into textual or audible representations. Previous studies investigating such speech neuroprostheses relied on electrocorticography (ECoG) or microelectrode arrays that acquire neural signals from superficial areas of the cortex. While both measurement methods have demonstrated successful speech decoding, they do not capture activity from deeper brain structures, and this activity has therefore not been harnessed for speech-related BCIs. In this study, we bridge this gap by adapting a previously presented decoding pipeline for speech synthesis based on ECoG signals to implanted depth electrodes (stereotactic EEG, sEEG). For this purpose, we propose a multi-input convolutional neural network that extracts speech-related activity separately for each electrode shaft and estimates spectral coefficients to reconstruct an audible waveform. We evaluate our approach on open-loop data from 5 patients who performed a recitation task of Dutch utterances. We achieve correlations of up to 0.80 between original and reconstructed speech spectrograms, which are significantly above chance level for all patients (p < 0.001). Our results indicate that sEEG can yield speech decoding performance similar to that of prior ECoG studies and is a promising modality for speech BCIs.
ISSN: 2694-0604
DOI: 10.1109/EMBC46164.2021.9629711
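
The summary reports correlations between original and reconstructed speech spectrograms but does not say how they are computed. A minimal sketch of one common approach is given below, assuming a Pearson correlation per spectral bin across time frames, averaged over bins. The function names (`pearson`, `mean_bin_correlation`), the frames-by-bins list layout, and the toy values are illustrative assumptions, not the authors' evaluation code or data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of floats."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant bin has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def mean_bin_correlation(original, reconstructed):
    """Average the per-bin Pearson correlation over all spectral bins.

    Both arguments are assumed to be lists of frames, where each frame
    is a list of spectral coefficients (frames x bins).
    """
    n_bins = len(original[0])
    scores = []
    for b in range(n_bins):
        orig_bin = [frame[b] for frame in original]
        reco_bin = [frame[b] for frame in reconstructed]
        scores.append(pearson(orig_bin, reco_bin))
    return sum(scores) / len(scores)


# Toy spectrograms (4 frames x 2 bins); hypothetical values for illustration.
original = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
reconstructed = [[0.0, 2.0], [2.0, 0.0], [4.0, 2.0], [6.0, 0.0]]
score = mean_bin_correlation(original, reconstructed)
```

Because the toy reconstruction is an exact positive rescaling of the original, every bin correlates perfectly and `score` is 1.0. Real decoded spectrograms would fall below that; the summary reports values of up to 0.80.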