An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers

Intelligibility listening tests are necessary during development and evaluation of speech processing algorithms, despite the fact that they are expensive and time consuming. In this paper, we propose a monaural intelligibility prediction algorithm, which has the potential of replacing some of these...

Full description

Saved in:
Bibliographic Details
Published inIEEE/ACM transactions on audio, speech, and language processing Vol. 24; no. 11; pp. 2009 - 2022
Main Authors Jensen, Jesper, Taal, Cees H.
Format Journal Article
LanguageEnglish
Published Piscataway IEEE 01.11.2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Intelligibility listening tests are necessary during development and evaluation of speech processing algorithms, despite the fact that they are expensive and time consuming. In this paper, we propose a monaural intelligibility prediction algorithm, which has the potential of replacing some of these listening tests. The proposed algorithm shows similarities to the short-time objective intelligibility (STOI) algorithm, but works for a larger range of input signals. In contrast to STOI, extended STOI (ESTOI) does not assume mutual independence between frequency bands. ESTOI also incorporates spectral correlation by comparing complete 400ms length spectrograms of the noisy/processed speech and the clean speech signals. As a consequence, ESTOI is also able to accurately predict the intelligibility of speech contaminated by temporally highly modulated noise sources in addition to noisy signals processed with time-frequency weighting. We show that ESTOI can be interpreted in terms of an orthogonal decomposition of short-time spectrograms into intelligibility subspaces, i.e., a ranking of spectrogram features according to their importance to intelligibility. A free MATLAB implementation of the algorithm is available for noncommercial use at http://kom.aau.dk/~jje/.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2329-9290
2329-9304
DOI:10.1109/TASLP.2016.2585878