An evaluation study of modulation-domain wavelet denoising method by alleviating different sub-band portions for speech enhancement
Published in | 2019 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-TW) pp. 1 - 2 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.05.2019 |
Subjects | |
Summary: | In this study, we investigate and extend the capability of modulation-domain wavelet denoising (ModWD) for speech enhancement, primarily by analyzing the unequal importance of different sub-band signals. The recently developed ModWD has been shown to improve speech quality in adverse noise environments by processing the magnitude spectrogram of a noisy speech signal with a one-level discrete wavelet transform (DWT) and then attenuating the resulting detail portion, which has been shown to be more vulnerable to noise. This study follows the idea of ModWD and first uses a wavelet packet decomposition (WPD) to decompose the magnitude spectral time series into four sub-band sequences. One of these four sub-band sequences is then zeroed out while the other three are kept unchanged. Finally, the four sub-band sequences are used to reconstruct the updated spectrogram. The main purpose of this procedure is to evaluate the noise robustness of the magnitude series in different sub-bands, which have twice the (modulation) frequency resolution of those used in ModWD. The presented method is evaluated on a subset of the Aurora-2 connected-digit database, and the speech quality results, in terms of Perceptual Evaluation of Speech Quality (PESQ) scores, reveal that zeroing out the second-highest frequency band (roughly within the range [25 Hz, 37.5 Hz]) yields the best performance. |
---|---|
DOI: | 10.1109/ICCE-TW46550.2019.8991839 |
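
The procedure described in the summary (split each magnitude time series into four sub-bands, zero one of them, reconstruct) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the wavelet, so a Haar filter is assumed here, and the function names (`haar_split`, `wpd2`, `zero_one_band`, and so on) are hypothetical. The quoted band edges suggest four equal bands up to 50 Hz, which would correspond to a 100 Hz frame rate; that figure is also an assumption. The series length is assumed to be a multiple of 4.

```python
import math

SQRT2 = math.sqrt(2.0)  # Haar normalisation (assumed wavelet, not from the record)


def haar_split(x):
    """One-level Haar DWT of an even-length sequence -> (approx, detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def wpd2(x):
    """Two-level wavelet packet decomposition into four sub-bands.

    Bands are returned in ascending frequency order. The detail branch is
    spectrally mirrored after downsampling, so its high-pass child (dd)
    sits below its low-pass child (da) in frequency.
    """
    if len(x) % 4 != 0:
        raise ValueError("series length must be a multiple of 4")
    a, d = haar_split(x)
    aa, ad = haar_split(a)
    da, dd = haar_split(d)
    return [aa, ad, dd, da]


def iwpd2(bands):
    """Reconstruct a series from four sub-bands given in frequency order."""
    aa, ad, dd, da = bands
    return haar_merge(haar_merge(aa, ad), haar_merge(da, dd))


def zero_one_band(series, band_index):
    """Zero the sub-band at band_index (0 = lowest, 3 = highest), then rebuild."""
    bands = wpd2(series)
    bands[band_index] = [0.0] * len(bands[band_index])
    return iwpd2(bands)


def process_spectrogram(magnitude, band_index):
    """Apply zero_one_band to the time series of every acoustic-frequency bin.

    magnitude: list of rows, one row per frequency bin, one value per frame.
    """
    return [zero_one_band(row, band_index) for row in magnitude]
```

Under these assumptions, `process_spectrogram(magnitude, 2)` would zero the second-highest modulation band, the setting the summary reports as giving the best PESQ scores. The frequency ordering matters: indexing the sub-bands in natural tree order instead would silently swap the two highest bands.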