A Supervised Approach to Global Signal-to-Noise Ratio Estimation for Whispered and Pathological Voices

Bibliographic Details
Published in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 296-300
Main Authors Poorjam, Amir Hossein; Little, Max A.; Jensen, Jesper Rindom; Christensen, Mads Graesboll
Format Conference Proceeding
Language English
Published IEEE 01.04.2018
Abstract The presence of background noise in signals adversely affects the performance of many speech-based algorithms. Accurate estimation of the signal-to-noise ratio (SNR), as a measure of the noise level in a signal, can help compensate for noise effects. Most existing SNR estimation methods have been developed for normal speech and might not provide accurate estimates for special speech types such as whispered or disordered voices, particularly when they are corrupted by non-stationary noise. In this paper, we first investigate the impact of stationary and non-stationary noise on the behavior of mel-frequency cepstral coefficients (MFCCs) extracted from normal, whispered and pathological voices. We demonstrate that, regardless of the speech type, the mean and the covariance of MFCCs are predictably modified by additive noise and that the amount of change is related to the noise level. We then propose a new supervised method for SNR estimation based on a regression model trained on MFCCs of the noisy signals. Experimental results show that the proposed approach provides accurate estimation and consistent performance for various speech types under different noise conditions.
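As a rough illustration of the quantity the abstract refers to, the global SNR of a recording can be taken as the ratio of total clean-signal energy to total noise energy, expressed in dB. The sketch below is not taken from the paper: the function names `energy`, `global_snr_db` and `mix_at_snr` are hypothetical, and it assumes the clean signal and the noise are available separately as equal-length sample sequences, which is how noisy training material for a supervised estimator is commonly synthesized.

```python
import math


def energy(x):
    """Total energy (sum of squared samples) of a sample sequence."""
    return sum(v * v for v in x)


def global_snr_db(clean, noise):
    """Global SNR in dB: 10*log10(E_clean / E_noise) over the whole signal."""
    return 10.0 * math.log10(energy(clean) / energy(noise))


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested global SNR.

    Assumes both sequences have the same length and non-zero energy
    (illustrative assumption, not a detail from the paper).
    Returns the noisy mixture and the scaled noise.
    """
    ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(energy(clean) / (energy(noise) * ratio))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled
```

A supervised estimator of the kind the abstract describes would be trained on features of such mixtures, with the known target SNR as the regression label; the feature extraction and regression steps are not shown here.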
Author_xml – sequence: 1
  givenname: Amir Hossein
  surname: Poorjam
  fullname: Poorjam, Amir Hossein
  organization: Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK
– sequence: 2
  givenname: Max A.
  surname: Little
  fullname: Little, Max A.
  organization: Engineering and Applied Science, Aston University, Birmingham, UK
– sequence: 3
  givenname: Jesper Rindom
  surname: Jensen
  fullname: Jensen, Jesper Rindom
  organization: Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK
– sequence: 4
  givenname: Mads Graesboll
  surname: Christensen
  fullname: Christensen, Mads Graesboll
  organization: Audio Analysis Lab, CREATE, Aalborg University, Aalborg, DK
DOI 10.1109/ICASSP.2018.8462459
Discipline Engineering
EISBN 9781538646588
1538646587
EISSN 2379-190X
EndPage 300
ExternalDocumentID 8462459
Genre orig-research
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
OpenAccessLink https://vbn.aau.dk/files/289318688/ICASSP2018_amir_SNR.pdf
PageCount 5
PublicationCentury 2000
PublicationDate 2018-April
PublicationDateYYYYMMDD 2018-04-01
PublicationDate_xml – month: 04
  year: 2018
  text: 2018-April
PublicationDecade 2010
PublicationTitle 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
PublicationTitleAbbrev ICASSP
PublicationYear 2018
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 296
SubjectTerms Estimation
Global SNR estimation
Mel frequency cepstral coefficient
MFCC
Noise level
Noise measurement
Pathological voice
Pathology
Signal to noise ratio
Support vector regression
Whispered speech
Title A Supervised Approach to Global Signal-to-Noise Ratio Estimation for Whispered and Pathological Voices
URI https://ieeexplore.ieee.org/document/8462459