LPC-GAN for Speech Super-Resolution

Bibliographic Details
Published in 2023 31st European Signal Processing Conference (EUSIPCO), pp. 346-350
Main Authors Schmidt, Konstantin; Edler, Bernd; Mahmoud, Ahmed Mustafa; Fuchs, Guillaume
Format Conference Proceeding
Language English
Published EURASIP 04.09.2023
Summary: To this day, telephone speech lacks perceptual quality and intelligibility because the encoding process removes bandwidth and introduces quantisation artefacts. Super-resolution artificially regenerates the missing frequency content and thereby improves perceptual quality and intelligibility. This work proposes a novel approach to super-resolution based on generative adversarial networks with convolutional architectures. Motivated by the source-filter model of human speech production, the proposed system decomposes the speech signal into a spectral envelope and an excitation signal. The missing frequency content of the envelope and of the excitation is restored with dedicated networks. The network restoring the excitation signal is trained such that there is no mismatch between the excitation signal and the envelope. In this way, we achieve better perceptual quality at lower computational complexity.
ISSN: 2076-1465
DOI: 10.23919/EUSIPCO58844.2023.10290032
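
The source-filter decomposition mentioned in the summary can be illustrated with a minimal sketch. It assumes a plain autocorrelation-method LPC analysis on a single frame; the record does not state the analysis details the authors use, so the function names (lpc_coefficients, decompose, synthesise), the prediction order of 16, and the Hann window are illustrative assumptions rather than the paper's method. The GAN networks that restore the missing band are not shown.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC; returns A(z) as [1, a_1, ..., a_order]."""
    windowed = frame * np.hanning(len(frame))
    # Autocorrelation at lags 0..order.
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(windowed) - 1 : len(windowed) + order].copy()
    r[0] *= 1.0 + 1e-9  # tiny regularisation (an assumption, for conditioning)
    # Solve the Toeplitz normal equations R a = -r[1:].
    a = solve_toeplitz(r[:order], -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def decompose(frame, order=16):
    """Split a frame into envelope coefficients and an excitation signal."""
    a = lpc_coefficients(frame, order)
    # Analysis (inverse) filter A(z) removes the spectral envelope.
    excitation = lfilter(a, [1.0], frame)
    return a, excitation


def synthesise(a, excitation):
    """Re-apply the envelope with the synthesis filter 1/A(z)."""
    return lfilter([1.0], a, excitation)
```

In this sketch the coefficient vector stands in for the spectral envelope and the prediction residual for the excitation. Filtering the residual through 1/A(z) recovers the frame, which is why an envelope and an excitation that are restored separately must remain consistent with each other.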