Diffusion maps for PLDA-based speaker verification

Bibliographic Details
Published in	2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 7639-7643
Main Authors	Barkan, Oren; Aronowitz, Hagai
Format	Conference Proceeding
Language	English
Published	IEEE 01.05.2013
Subjects	Diffusion Maps; Diffusion processes; Feature extraction; Harmonic analysis; ivectors; Manifolds; NIST; Non-linear dimensionality reduction; Pattern recognition; Speaker recognition; Speaker verification; Vectors
Summary:	During the last few years, i-vectors have become an important component in most state-of-the-art speaker recognition systems. I-vector extraction is based on the assumption that GMM supervectors reside in a low-dimensional space, which is modeled using Factor Analysis. In this paper we replace that assumption with the assumption that the GMM supervectors reside on a low-dimensional manifold, and we propose to use Diffusion Maps to learn that manifold. The learnt manifold implies a mapping of spoken sessions into a modified i-vector space, which we call d-vector space. D-vectors can be further processed using standard techniques such as LDA, WCCN, cosine distance scoring or Probabilistic Linear Discriminant Analysis (PLDA). We demonstrate the usefulness of our approach on the telephone core conditions of NIST 2010 and obtain a significant error reduction.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2013.6639149
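
Illustrative sketch:	The summary names Diffusion Maps and cosine distance scoring but gives no implementation details. The Python sketch below shows a generic textbook diffusion map (Gaussian kernel, random-walk normalization) followed by cosine scoring. It is not the authors' method: the kernel, the median bandwidth heuristic, the function names diffusion_map and cosine_score, and the random data standing in for GMM supervectors are all assumptions made for illustration.

```python
import numpy as np


def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Embed the rows of X with a basic diffusion map (illustrative only)."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between rows.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, 0.0)
    if epsilon is None:
        # Assumed heuristic: kernel width = median nonzero squared distance.
        epsilon = np.median(d2[d2 > 0])
    K = np.exp(-d2 / epsilon)  # Gaussian affinity matrix
    deg = K.sum(axis=1)
    # Symmetric matrix with the same eigenvalues as the Markov matrix D^-1 K.
    S = K / np.sqrt(np.outer(deg, deg))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]  # sort eigenpairs in descending order
    evals, evecs = evals[order], evecs[:, order]
    # Convert to right eigenvectors of the Markov matrix.
    psi = evecs / np.sqrt(deg)[:, None]
    # Drop the trivial first eigenpair (eigenvalue 1, constant vector).
    coords = psi[:, 1:n_components + 1] * evals[1:n_components + 1] ** t
    return coords, evals


def cosine_score(a, b):
    """Cosine similarity between two embedded vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Random vectors stand in for GMM supervectors (hypothetical data).
rng = np.random.default_rng(0)
supervectors = rng.normal(size=(50, 20))
embedding, eigenvalues = diffusion_map(supervectors, n_components=3)
score = cosine_score(embedding[0], embedding[1])
```

In this sketch, each row of embedding plays the role of a low-dimensional session representation and score compares two sessions. Mapping unseen sessions into the embedding would need an out-of-sample extension, which is not covered here.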