NeuralMultiling: A Novel Neural Architecture Search for Smartphone based Multilingual Speaker Verification

Bibliographic Details
Published in	arXiv.org
Main Authors	Aravinda Reddy PN; Ramachandra, Raghavendra; K Sreenivasa Rao; Mitra, Pabitra
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 08.08.2024
Subjects	Algorithms; Audio visual equipment; Error reduction; Languages; Verification
Summary:	Multilingual speaker verification introduces the challenge of verifying a speaker in multiple languages. Existing systems were built using i-vector/x-vector approaches along with Bi-LSTMs, which were trained to discriminate speakers irrespective of the language. Instead of exploring the design space manually, we propose a neural architecture search for multilingual speaker verification suitable for mobile devices, called NeuralMultiling. First, our algorithm searches for an optimal combination of operations for the neural cells, with different architectures for normal cells and reduction cells, and then derives a CNN model by stacking these cells. Using the derived architecture, we performed two studies: 1) a language-agnostic condition and 2) interoperability between languages and devices, both on the publicly available Multilingual Audio-Visual Smartphone (MAVS) dataset. The experimental results suggest that the derived architecture significantly outperforms the existing AutoSpeech method, with a 5-6% reduction in the Equal Error Rate (EER) and fewer model parameters.
ISSN:	2331-8422
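
Note on the metric: the summary reports results as Equal Error Rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how EER could be estimated from verification scores. It is not taken from the paper; the function name, the threshold sweep over observed scores, and the sample scores are all illustrative assumptions.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate EER by sweeping a threshold over all observed scores.

    Illustrative helper (not from the paper). A trial is accepted
    when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False acceptance rate: impostor trials that pass the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials that fail the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        # Keep the first threshold where FAR and FRR are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Made-up scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch reports the midpoint of FAR and FRR at the threshold where they are closest; evaluation toolkits often interpolate instead.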