Hybrid DWT and MFCC feature warping for noisy forensic speaker verification in room reverberation
Published in | 2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA) pp. 434 - 439 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.09.2017 |
Subjects | |
Summary: | The robustness of speaker verification systems is often degraded in real forensic applications, where recordings contain environmental noise and reverberation. Reverberation results in mismatched conditions between enrolment and test speech signals. In this work, we investigate the effectiveness of combining discrete wavelet transform (DWT) features with feature-warped mel frequency cepstral coefficients (MFCCs) to improve speaker verification performance under reverberation and environmental noise. A state-of-the-art intermediate vector (i-vector) and probabilistic linear discriminant analysis (PLDA) framework was used as the classifier. The algorithm was evaluated by convolving a room impulse response with enrolment speech from an Australian forensic voice comparison database. The test speech signals were mixed with car, street, and home noise from the QUT-NOISE database at signal-to-noise ratios (SNRs) ranging from -10 dB to 10 dB. Experimental results indicate that the algorithm reduces the average equal error rate (EER) by 17.10% to 51.86% relative to traditional MFCC features when the enrolment data are reverberated and the test speech signals are corrupted with car, street, and home noise at SNRs ranging from -10 dB to 10 dB. |
---|---|
DOI: | 10.1109/ICSIPA.2017.8120650 |
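
The evaluation setup described in the summary (reverberated enrolment speech, test speech mixed with noise at a fixed SNR) can be illustrated with a minimal sketch. This is not the authors' code: the function names `reverberate` and `mix_at_snr`, the synthetic signals, the 5 dB step between SNR levels, and the choice to truncate the convolved signal to the original length are all assumptions made for illustration. Real experiments would load recorded speech, a measured room impulse response, and QUT-NOISE recordings instead of random arrays.

```python
import numpy as np

def reverberate(speech, rir):
    """Convolve speech with a room impulse response (hypothetical helper).

    The result is truncated to the length of the input, which is an
    assumption of this sketch rather than something the record specifies.
    """
    return np.convolve(speech, rir)[: len(speech)]

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB."""
    # Repeat the noise if it is shorter than the speech, then trim to length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Synthetic stand-ins for real recordings (assumed 16 kHz, 1 second of speech).
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
noise = rng.standard_normal(8000)
# Toy exponentially decaying impulse response, not a measured room response.
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)

enrol = reverberate(speech, rir)
# SNR range taken from the summary; the 5 dB step is an assumption.
noisy_tests = {snr: mix_at_snr(speech, noise, snr) for snr in (-10, -5, 0, 5, 10)}
```

Scaling the noise rather than the speech keeps the speech level constant across SNR conditions, which is one common convention; scaling the speech instead would give the same SNR.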