A Supervised Approach to Global Signal-to-Noise Ratio Estimation for Whispered and Pathological Voices

Bibliographic Details
Published in	2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 296-300
Main Authors	Poorjam, Amir Hossein; Little, Max A.; Jensen, Jesper Rindom; Christensen, Mads Graesboll
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2018
Subjects	Estimation; Global SNR estimation; Mel frequency cepstral coefficient; MFCC; Noise level; Noise measurement; pathological voice; Pathology; Signal to noise ratio; support vector regression; whispered speech
Abstract	Background noise in a signal degrades the performance of many speech-based algorithms. An accurate estimate of the signal-to-noise ratio (SNR), as a measure of the noise level in a signal, can help compensate for noise effects. Most existing SNR estimation methods have been developed for normal speech and might not provide accurate estimates for special speech types such as whispered or disordered voices, particularly when they are corrupted by non-stationary noise. In this paper, we first investigate the impact of stationary and non-stationary noise on the behavior of mel-frequency cepstral coefficients (MFCCs) extracted from normal, whispered, and pathological voices. We demonstrate that, regardless of the speech type, the mean and covariance of the MFCCs are predictably modified by additive noise, and that the amount of change is related to the noise level. We then propose a new supervised method for SNR estimation based on a regression model trained on MFCCs of the noisy signals. Experimental results show that the proposed approach provides accurate estimates and consistent performance for various speech types under different noise conditions.
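
Illustrative note (not part of the catalog record or the paper): a supervised SNR estimator needs noisy signals whose global SNR is known. The sketch below shows the standard definition of global SNR in dB, 10·log10(signal power / noise power), and one common way to scale a noise recording so that a mixture reaches a chosen SNR. The function names, the toy sine tone, and the white Gaussian noise are assumptions made for illustration only; the record does not say how the authors built their training data, and no MFCC extraction or regression step is shown here.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def global_snr_db(clean, noise):
    """Global SNR in dB: total signal power over total noise power."""
    return 10.0 * math.log10(power(clean) / power(noise))


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested global SNR.

    Returns the noisy mixture and the scaled noise (hypothetical helper,
    not taken from the paper).
    """
    target_ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(power(clean) / (power(noise) * target_ratio))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


# Toy stand-ins for real recordings: a 220 Hz tone and white Gaussian noise.
rng = random.Random(0)
fs = 16000
clean = [math.sin(2 * math.pi * 220 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=5.0)
achieved = global_snr_db(clean, scaled_noise)
print(round(achieved, 6))
```

In a pipeline of the kind the abstract describes, each `noisy` signal would be paired with its target SNR as the regression label, with features computed from the noisy signal alone.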
Author Affiliations	Poorjam, Amir Hossein (Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK); Little, Max A. (Engineering and Applied Science, Aston University, Birmingham, UK); Jensen, Jesper Rindom (Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK); Christensen, Mads Graesboll (Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK)
DOI	10.1109/ICASSP.2018.8462459
Discipline	Engineering
EISBN	9781538646588 1538646587
EISSN	2379-190X
OpenAccessLink	https://vbn.aau.dk/files/289318688/ICASSP2018_amir_SNR.pdf
URI	https://ieeexplore.ieee.org/document/8462459