UNISOUND System for VoxCeleb Speaker Recognition Challenge 2023

Bibliographic Details
Published in	arXiv.org
Main Authors	Zheng, Yu; Zhang, Yajun; Niu, Chuanying; Zhan, Yibin; Long, Yanhua; Xu, Dongxing
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 24.08.2023
Subjects	Consistency; Speech recognition

Summary:	This report describes the UNISOUND submission for Track 1 and Track 2 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC 2023). We submit the same system to both tracks; it is trained only on VoxCeleb2-dev. Large-scale ResNet and RepVGG architectures are developed for the challenge. We propose a consistency-aware score calibration method that uses a Consistency Measure Factor (CMF) to exploit the stability of audio voiceprints in the similarity score. CMF brings a large performance boost in this challenge. Our final system is a fusion of six models and achieves first place in Track 1 and second place in Track 2 of VoxSRC 2023. Our submission has a minDCF of 0.0855 and an EER of 1.5880%.
ISSN:	2331-8422
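
Note on the reported metrics: the summary quotes a minDCF and an EER. As a rough illustration of how such figures are typically derived from a list of trial scores, the following is a minimal Python sketch. It is not the challenge's official scoring tool. The function names are hypothetical, and the cost parameters (P_target = 0.05, C_miss = C_fa = 1) are assumptions that this record does not state. Tied scores are not handled specially.

```python
import numpy as np

def error_rates(scores, labels):
    """Return (fnr, fpr) for every threshold position.

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    Position i corresponds to rejecting the i lowest-scoring trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    labels = labels[np.argsort(scores, kind="stable")]
    n_tar = labels.sum()
    n_non = len(labels) - n_tar
    # Targets rejected so far (misses) and non-targets still accepted (false alarms).
    fnr = np.concatenate(([0], np.cumsum(labels))) / n_tar
    fpr = np.concatenate(([n_non], n_non - np.cumsum(1 - labels))) / n_non
    return fnr, fpr

def eer(fnr, fpr):
    """Equal error rate: the point where miss and false-alarm rates are closest."""
    i = np.argmin(np.abs(fnr - fpr))
    return (fnr[i] + fpr[i]) / 2

def min_dcf(fnr, fpr, p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Minimum normalized detection cost (parameter defaults are assumptions)."""
    dcf = c_miss * p_target * fnr + c_fa * (1 - p_target) * fpr
    # Normalize by the cost of the best trivial system (accept all or reject all).
    return dcf.min() / min(c_miss * p_target, c_fa * (1 - p_target))
```

For example, `fnr, fpr = error_rates(scores, labels)` followed by `eer(fnr, fpr)` and `min_dcf(fnr, fpr)` yields both metrics as fractions; multiply the EER by 100 to compare it with a percentage such as the one quoted above.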