LPC-GAN for Speech Super-Resolution

Bibliographic Details
Published in	2023 31st European Signal Processing Conference (EUSIPCO) pp. 346 - 350
Main Authors	Schmidt, Konstantin; Edler, Bernd; Mahmoud, Ahmed Mustafa; Fuchs, Guillaume
Format	Conference Proceeding
Language	English
Published	EURASIP 04.09.2023
Subjects	artificial bandwidth expansion; audio superresolution; Bandwidth; bandwidth extension; Generative adversarial networks; Quantization (signal); Speech coding; Speech enhancement; Speech recognition; speech super-resolution; Superresolution
Summary:	To this day, telephone speech lacks perceptual quality and intelligibility due to the bandwidth limitation and quantisation artefacts introduced by the encoding process. Super-resolution artificially regenerates the missing frequency content and thus improves perceptual quality and intelligibility. This work proposes a novel approach to super-resolution based on generative adversarial networks with convolutional architectures. Motivated by the source-filter model of human speech production, the proposed system decomposes the speech signal into a spectral envelope and an excitation signal. The missing frequency content of the envelope and of the excitation is restored with dedicated networks. The network restoring the excitation signal is trained such that there is no mismatch between the excitation signal and the envelope. In this way, better perceptual quality is achieved at lower computational complexity.
ISSN:	2076-1465
DOI:	10.23919/EUSIPCO58844.2023.10290032
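
Illustrative note: the summary describes splitting speech into a spectral envelope and an excitation signal, following the source-filter model. The sketch below shows one common way to do such a split, linear predictive coding (LPC) with the autocorrelation method and the Levinson-Durbin recursion. It is an assumption-laden illustration, not the authors' implementation: the record does not state the LPC order, framing, or windowing used in the paper, so the order of 8, the 400-sample Hann-windowed frame, the synthetic test signal, and all function names are hypothetical choices. The GAN-based restoration networks are not shown.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns a = [1, a1, ..., a_order], the coefficients of the
    analysis (inverse) filter A(z). Assumes the frame is not all zeros.
    """
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy, updated at each order
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient for order i
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def analysis_filter(frame, a):
    """Excitation (residual): e[n] = sum_k a[k] * x[n-k], zero initial state."""
    return np.convolve(frame, a)[:len(frame)]

def synthesis_filter(excitation, a):
    """All-pole filter 1/A(z): x[n] = e[n] - sum_{k>=1} a[k] * x[n-k]."""
    order = len(a) - 1
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        past = sum(a[k] * out[n - k] for k in range(1, order + 1) if n - k >= 0)
        out[n] = excitation[n] - past
    return out

def envelope_db(a, n_fft=512):
    """Spectral envelope |1/A(e^{jw})| in dB on an rfft frequency grid."""
    return -20.0 * np.log10(np.abs(np.fft.rfft(a, n_fft)) + 1e-12)

# Hypothetical test signal: noise shaped by a stable two-pole filter,
# standing in for one frame of speech.
rng = np.random.default_rng(0)
noise = rng.standard_normal(400)
frame = synthesis_filter(noise, np.array([1.0, -1.3, 0.8]))
windowed = frame * np.hanning(len(frame))

order = 8  # assumed order, not taken from the paper
a = lpc_coefficients(windowed, order)
excitation = analysis_filter(windowed, a)       # "source" part
envelope = envelope_db(a)                       # "filter" part
reconstructed = synthesis_filter(excitation, a) # envelope + excitation -> signal
```

In a system of the kind the summary describes, the envelope and the excitation would each be extended in bandwidth by a dedicated network before the synthesis step recombines them; here the two parts are recombined unchanged, which simply returns the input frame.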