A Supervised Approach to Global Signal-to-Noise Ratio Estimation for Whispered and Pathological Voices
Published in | 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) pp. 296 - 300 |
---|---|
Main Authors | Poorjam, Amir Hossein; Little, Max A.; Jensen, Jesper Rindom; Christensen, Mads Graesboll |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.04.2018 |
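Subjects | Estimation; Global SNR estimation; Mel frequency cepstral coefficient; MFCC; Noise level; Noise measurement; pathological voice; Pathology; Signal to noise ratio; support vector regression; whispered speech |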
Abstract | The presence of background noise in signals adversely affects the performance of many speech-based algorithms. Accurate estimation of the signal-to-noise ratio (SNR), as a measure of the noise level in a signal, can help in compensating for noise effects. Most existing SNR estimation methods have been developed for normal speech and might not provide accurate estimation for special speech types such as whispered or disordered voices, particularly when they are corrupted by non-stationary noises. In this paper, we first investigate the impact of stationary and non-stationary noise on the behavior of mel-frequency cepstral coefficients (MFCCs) extracted from normal, whispered and pathological voices. We demonstrate that, regardless of the speech type, the mean and the covariance of MFCCs are predictably modified by additive noise and the amount of change is related to the noise level. Then, we propose a new supervised method for SNR estimation that is based on a regression model trained on MFCCs of the noisy signals. Experimental results show that the proposed approach provides accurate estimation and consistent performance for various speech types under different noise conditions. |
---|---|
Author | Jensen, Jesper Rindom; Poorjam, Amir Hossein; Little, Max A.; Christensen, Mads Graesboll |
Author_xml | – sequence: 1 givenname: Amir Hossein surname: Poorjam fullname: Poorjam, Amir Hossein organization: Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK – sequence: 2 givenname: Max A. surname: Little fullname: Little, Max A. organization: Engineering and Applied Science, Aston University, Birmingham, UK – sequence: 3 givenname: Jesper Rindom surname: Jensen fullname: Jensen, Jesper Rindom organization: Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK – sequence: 4 givenname: Mads Graesboll surname: Christensen fullname: Christensen, Mads Graesboll organization: Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK |
ContentType | Conference Proceeding |
DBID | 6IE 6IH CBEJK RIE RIO |
DOI | 10.1109/ICASSP.2018.8462459 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEL IEEE Proceedings Order Plans (POP) 1998-present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore Digital Library url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISBN | 9781538646588 1538646587 |
EISSN | 2379-190X |
EndPage | 300 |
ExternalDocumentID | 8462459 |
Genre | orig-research |
GroupedDBID | 23M 29P 6IE 6IF 6IH 6IK 6IL 6IM 6IN AAJGR ABLEC ACGFS ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP IPLJI JC5 M43 OCL RIE RIL RIO RNS |
ID | FETCH-LOGICAL-i220t-f44ec0ffc21b8429981179c82106a7456ecba73eb4d2b93525560468f7c1c7333 |
IEDL.DBID | RIE |
IngestDate | Wed Jun 26 19:28:30 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i220t-f44ec0ffc21b8429981179c82106a7456ecba73eb4d2b93525560468f7c1c7333 |
OpenAccessLink | https://vbn.aau.dk/files/289318688/ICASSP2018_amir_SNR.pdf |
PageCount | 5 |
ParticipantIDs | ieee_primary_8462459 |
PublicationCentury | 2000 |
PublicationDate | 2018-April |
PublicationDateYYYYMMDD | 2018-04-01 |
PublicationDate_xml | – month: 04 year: 2018 text: 2018-April |
PublicationDecade | 2010 |
PublicationTitle | 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
PublicationTitleAbbrev | ICASSP |
PublicationYear | 2018 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0002684756 ssj0008748 |
Score | 2.1599848 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 296 |
SubjectTerms | Estimation; Global SNR estimation; Mel frequency cepstral coefficient; MFCC; Noise level; Noise measurement; pathological voice; Pathology; Signal to noise ratio; support vector regression; whispered speech |
Title | A Supervised Approach to Global Signal-to-Noise Ratio Estimation for Whispered and Pathological Voices |
URI | https://ieeexplore.ieee.org/document/8462459 |
hasFullText | 1 |
inHoldings | 1 |
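
The abstract in this record treats the global SNR as a single measure of noise level for a whole signal and describes a supervised estimator trained on noisy signals. The sketch below only illustrates the quantity being estimated: it computes the global SNR in dB and scales a noise sequence so that a mixture has a chosen SNR, which is one common way to produce labelled noisy examples. It is not taken from the paper. The function names, the synthetic sine "speech" signal, the Gaussian noise, and the 5 dB target are all assumptions made for illustration, and the MFCC extraction and regression steps are not shown.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def global_snr_db(clean, noise):
    """Global SNR in dB, 10*log10(P_signal / P_noise), over the whole signal."""
    return 10.0 * math.log10(power(clean) / power(noise))


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested global SNR."""
    ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(power(clean) / (power(noise) * ratio))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


rng = random.Random(0)
# Hypothetical stand-ins for a speech recording and a noise recording.
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# The target SNR is the label a supervised estimator would be trained to predict.
noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=5.0)
achieved = global_snr_db(clean, scaled_noise)
```

Here `achieved` should match the requested target up to floating-point error, because the gain is derived directly from the SNR definition. A supervised estimator of the kind the abstract describes would see only `noisy`, not `clean` or `scaled_noise`.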