Controlling the Perceived Sound Quality for Dialogue Enhancement With Deep Learning

Bibliographic Details
Published in	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 51-55
Main Authors	Uhle, Christian; Torcoli, Matteo; Paulus, Jouni
Format	Conference Proceeding
Language	English
Published	IEEE 01.05.2020
Subjects	Artifact-related Perceptual Score; Attenuation; Deep Learning; Dialogue Enhancement; Neural networks; Parameter estimation; Signal processing; Speech enhancement; System performance; Training

Summary:	Speech enhancement attenuates interfering sounds in speech signals but may introduce artifacts that perceivably deteriorate the output signal. We propose a method for controlling the trade-off between the attenuation of the interfering background signal and the loss of sound quality. A deep neural network estimates the attenuation of the separated background signal such that the sound quality, quantified using the Artifact-related Perceptual Score, meets an adjustable target. Subjective evaluations indicate that consistent sound quality is obtained across various input signals. Our experiments show that the proposed method can control the trade-off with an accuracy that is adequate for real-world dialogue enhancement applications.
ISSN:	2379-190X
DOI:	10.1109/ICASSP40776.2020.9053789