Development of statistical estimators for speech enhancement using multi-objective grey wolf optimizer

Bibliographic Details
Published in	Evolutionary intelligence Vol. 14; no. 2; pp. 767-778
Main Authors	Dash, Tusar Kanti; Solanki, Sandeep Singh; Panda, Ganapati; Satapathy, Suresh Chandra
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.06.2021 (Springer Nature B.V.)
Subjects	Algorithms; Applications of Mathematics; Artificial Intelligence; Bioinformatics; Constants; Control; Engineering; Intelligibility; Mathematical and Computational Engineering; Maximization; Mechatronics; Multiple objective analysis; Noise; Noise levels; Noise threshold; Optimization; Robotics; Smoothing; Special Issue; Speech; Speech processing; Statistical analysis; Statistical Physics and Dynamical Systems; Fuzzy logic; Statistical estimators; Quality; Speech enhancement; Bio-inspired techniques; MOGWO
Summary:	Statistical estimation using the SNR uncertainty technique is one of the effective Speech Enhancement (SE) algorithms. In this method, the gain function plays a crucial role, and it depends on the proper selection of the smoothing and threshold constants. In the literature, the values of these constants have been optimized for a single objective, the maximization of speech quality, under one specific noise condition. In practice, however, the noise magnitude varies, and a single set of optimized parameters cannot always provide consistent performance. This paper addresses the problem in three steps. The first step is a multi-objective optimization that finds the best values of the smoothing and threshold constants at different noise levels, with the objectives of maximizing speech quality and intelligibility and minimizing the mean square error. The second step classifies the noisy speech into four SNR levels (0 dB, 5 dB, 10 dB, and 15 dB) using appropriate audio features. The values obtained in steps one and two are stored, and in the third step, when an unknown noisy speech signal is to be enhanced, the stored smoothing and threshold constants that best match it are selected for the task. Finally, the proposed method is evaluated on two different speech datasets. A comparative performance and statistical analysis against six other standard SE algorithms shows that the proposed approach outperforms them.
ISSN:	1864-5909 1864-5917
DOI:	10.1007/s12065-020-00446-0
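
The three steps in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: it does not run MOGWO and does not use the paper's audio-feature classifier. It only shows the Pareto-dominance test that underlies a multi-objective search over quality, intelligibility, and mean square error, plus a lookup of stored constants by SNR class, where the classifier is replaced by rounding an SNR estimate to the nearest of the four levels. All names (`Candidate`, `pareto_front`, `select_constants`) are hypothetical, and every numeric value is a placeholder rather than a result from the paper.

```python
from typing import Dict, List, NamedTuple, Tuple

# The four SNR classes named in the summary.
SNR_LEVELS_DB = (0, 5, 10, 15)


class Candidate(NamedTuple):
    """One (smoothing, threshold) pair with its three objective scores."""
    smoothing: float
    threshold: float
    quality: float          # higher is better
    intelligibility: float  # higher is better
    mse: float              # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and better on one."""
    no_worse = (a.quality >= b.quality
                and a.intelligibility >= b.intelligibility
                and a.mse <= b.mse)
    strictly_better = (a.quality > b.quality
                       or a.intelligibility > b.intelligibility
                       or a.mse < b.mse)
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


def nearest_snr_class(snr_db: float) -> int:
    """Stand-in for the classifier: round an SNR estimate to a known level."""
    return min(SNR_LEVELS_DB, key=lambda level: abs(level - snr_db))


def select_constants(table: Dict[int, Tuple[float, float]],
                     snr_db: float) -> Tuple[float, float]:
    """Look up the stored (smoothing, threshold) pair for the SNR class."""
    return table[nearest_snr_class(snr_db)]


# Placeholder candidates for one noise level (values are made up).
candidates = [
    Candidate(0.90, 0.10, quality=2.1, intelligibility=0.70, mse=0.020),
    Candidate(0.95, 0.15, quality=2.3, intelligibility=0.72, mse=0.018),
    Candidate(0.98, 0.20, quality=2.2, intelligibility=0.75, mse=0.025),
]
front = pareto_front(candidates)

# Placeholder table of stored constants, one pair per SNR class.
stored = {0: (0.98, 0.20), 5: (0.95, 0.15), 10: (0.92, 0.12), 15: (0.90, 0.10)}
smoothing, threshold = select_constants(stored, snr_db=6.2)
```

In this sketch the first candidate drops out of the front because the second one is better on all three objectives, while the third survives because it has the highest intelligibility. A full method would still need a rule for picking one point from the front at each noise level before storing it in the table.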