Speech Foundation Models on Intelligibility Prediction for Hearing-Impaired Listeners

Bibliographic Details
Published in	ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1421-1425
Main Authors	Cuervo, Santiago; Marxer, Ricard
Format	Conference Proceeding
Language	English
Published	IEEE 14.04.2024
Subjects	Acoustics; Adaptation models; Benchmark testing; Foundation models; hearing aids; intelligibility prediction; Magnetic heads; Predictive models; Recording; speech perception; Systematics

Summary:	Speech foundation models (SFMs) have been benchmarked on many speech processing tasks, often achieving state-of-the-art performance with minimal adaptation. However, the SFM paradigm has been significantly less explored for applications of interest to the speech perception community. In this paper, we present a systematic evaluation of 10 SFMs on one such application: speech intelligibility prediction. We focus on the non-intrusive setup of the Clarity Prediction Challenge 2 (CPC2), where the task is to predict the percentage of words correctly perceived by hearing-impaired listeners from speech-in-noise recordings. To approach the problem, we propose a simple method that learns a lightweight, specialized prediction head on top of frozen SFMs. Our results reveal statistically significant differences in performance across SFMs. Our method produced the winning submission to CPC2, demonstrating its promise for speech perception applications.
ISSN:	2379-190X
DOI:	10.1109/ICASSP48485.2024.10447907
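
The Summary describes learning a lightweight prediction head on top of a frozen SFM to predict the percentage of words understood. The record does not specify the head's architecture, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes mean pooling over time and a linear layer with a sigmoid output. `frozen_features` is a hypothetical stand-in that returns random vectors in place of a real SFM, and the targets are made-up numbers.

```python
import math
import random


def frozen_features(num_frames, dim, seed):
    """Hypothetical stand-in for a frozen SFM (assumption): random per-frame features."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]


def mean_pool(frames):
    """Average frame-level features over time into one utterance-level vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


class LinearHead:
    """Small trainable head (assumed design): 100 * sigmoid(w . x + b)."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def predict(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 100.0 / (1.0 + math.exp(-z))

    def sgd_step(self, x, target, lr=0.5):
        # Squared error on the 0-1 scale; only the head's parameters change,
        # the feature extractor is never updated (it is "frozen").
        p = self.predict(x) / 100.0
        grad_z = 2.0 * (p - target / 100.0) * p * (1.0 - p)
        self.w = [wi - lr * grad_z * xi for wi, xi in zip(self.w, x)]
        self.b -= lr * grad_z


def rmse(head, inputs, targets):
    """Root-mean-square error between predictions and targets, in percentage points."""
    errs = [(head.predict(x) - t) ** 2 for x, t in zip(inputs, targets)]
    return math.sqrt(sum(errs) / len(errs))


# Toy data: four synthetic "utterances" with invented intelligibility scores (%).
DIM = 8
utterances = [mean_pool(frozen_features(50, DIM, seed=s)) for s in range(4)]
targets = [10.0, 40.0, 70.0, 90.0]

head = LinearHead(DIM)
before = rmse(head, utterances, targets)
for _ in range(2000):
    for x, t in zip(utterances, targets):
        head.sgd_step(x, t)
after = rmse(head, utterances, targets)

print(f"RMSE before training: {before:.2f}, after: {after:.2f}")
```

In a real setup, `frozen_features` would be replaced by a pretrained model run in inference mode, and the head would be trained on listener responses rather than invented targets.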