Latent discriminative representation learning for speaker recognition

Bibliographic Details
Published in: Frontiers of Information Technology & Electronic Engineering, Vol. 22, No. 5, pp. 697-708
Main Authors: Huang, Duolin; Mao, Qirong; Ma, Zhongchen; Zheng, Zhishen; Routray, Sidheswar; Ocquaye, Elias-Nii-Noi
Format: Journal Article
Language: English
Published: Zhejiang University Press, Hangzhou; Springer Nature B.V., 01.05.2021
Author Affiliations: Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace, Zhenjiang 212013, China; School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
ISSN: 2095-9184, 2095-9230
DOI: 10.1631/FITEE.1900690

More Information
Summary: Extracting discriminative speaker-specific representations from speech signals and transforming them into fixed-length vectors are key steps in speaker identification and verification systems. In this study, we propose a latent discriminative representation learning method for speaker recognition, in which the learned representations are not only discriminative but also relevant. Specifically, we introduce an additional speaker-embedding lookup table to exploit the relevance between different utterances from the same speaker. Moreover, a reconstruction constraint that learns a linear mapping matrix is introduced to make the representations discriminative. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on the Apollo dataset used in the Fearless Steps Challenge at INTERSPEECH 2019 and on the TIMIT dataset.
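The record carries only the abstract, so the following PyTorch sketch is merely an illustration of how the two mechanisms the summary names could fit together: a speaker-embedding lookup table that relates utterances of the same speaker, and a linear mapping matrix trained under a reconstruction constraint. Every name and hyperparameter below (LatentDiscriminativeEmbedder, the layer sizes, the loss weight alpha) is an assumption for illustration, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentDiscriminativeEmbedder(nn.Module):
    # Illustrative sketch only; architecture details are assumptions.
    def __init__(self, feat_dim=40, emb_dim=256, num_speakers=630):
        super().__init__()
        # Frame-level encoder (stand-in for the paper's front-end network).
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, emb_dim),
        )
        # Speaker-embedding lookup table: one learned vector per training
        # speaker, shared by all of that speaker's utterances.
        self.speaker_table = nn.Embedding(num_speakers, emb_dim)
        # Linear mapping matrix learned under the reconstruction constraint.
        self.mapping = nn.Linear(emb_dim, emb_dim, bias=False)
        # Softmax classifier supplying the discriminative objective.
        self.classifier = nn.Linear(emb_dim, num_speakers)

    def forward(self, frames):
        # frames: (batch, time, feat_dim); average pooling over time yields
        # the fixed-length utterance vector mentioned in the abstract.
        return self.encoder(frames).mean(dim=1)

    def loss(self, frames, speaker_ids, alpha=0.1):
        utt = self.forward(frames)
        # Discriminative term: identify the speaker from the utterance vector.
        ce = F.cross_entropy(self.classifier(utt), speaker_ids)
        # Relevance term: the linearly mapped utterance vector should
        # reconstruct the shared embedding of its speaker, pulling different
        # utterances of the same speaker toward a common point.
        target = self.speaker_table(speaker_ids)
        rec = F.mse_loss(self.mapping(utt), target)
        return ce + alpha * rec

A usage sketch under the same assumptions:

model = LatentDiscriminativeEmbedder()
frames = torch.randn(8, 200, 40)          # 8 utterances, 200 frames each
speakers = torch.randint(0, 630, (8,))    # TIMIT has 630 speakers
loss = model.loss(frames, speakers)
loss.backward()

At verification time one would discard the classifier and compare utterance vectors directly (e.g., by cosine similarity); the lookup table only supplies a training-time target.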