Joint Decision of Anti-Spoofing and Automatic Speaker Verification by Multi-Task Learning With Contrastive Loss

Bibliographic Details
Published in IEEE Access, Vol. 8, pp. 7907-7915
Main Authors Li, Jiakang; Sun, Meng; Zhang, Xiongwei; Wang, Yimin
Format Journal Article
Language English
Published Piscataway, IEEE, 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Automatic speaker verification (ASV) is an emerging biometric verification technique with a growing number of applications. However, both verification accuracy and anti-spoofing must be considered carefully before ASV is put into practice. Here anti-spoofing, also called replay detection, refers to detecting attacks in which a voice is recorded, stored and replayed to deceive ASV systems. A cascaded decision of anti-spoofing and ASV is a straightforward way to address both issues. In this paper, a joint decision of anti-spoofing and ASV was investigated in a multi-task learning framework with contrastive loss, with the aim of improving on the cascaded decision approach. A modified triplet loss was first constructed to supervise deep neural networks in extracting embedding vectors that carry information about both speaker identity and spoofing. The embedding vectors were then used as input features by back-end classifiers for speaker and spoofing classification. Experimental results on both ASVspoof 2017 and ASVspoof 2019 showed that the proposed joint decision approach with triplet loss outperformed the corresponding baselines, a recent work on joint decision with Gaussian back-end fusion, and our previous joint decision approach with cross-entropy loss.
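As an illustration of the loss family named in the abstract, the following is a minimal sketch of a standard triplet loss over embedding vectors. It is not the modified triplet loss proposed in the paper, whose exact form is not given in this record. The function names, the margin value, and the toy embeddings are hypothetical, and treating "same speaker, genuine speech" as the positive and "different speaker or replayed speech" as the negative is an assumption made only for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an illustrative default).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]    # enrolment utterance
positive = [0.0, 1.0]  # assumed: same speaker, genuine speech (distance 1)
negative = [3.0, 4.0]  # assumed: other speaker or replayed speech (distance 5)

loss_satisfied = triplet_loss(anchor, positive, negative)  # margin already met
loss_violated = triplet_loss(anchor, negative, positive)   # roles swapped
```

With these toy values the first triplet already satisfies the margin, so it contributes no loss, while the swapped triplet is penalised. In a real system the embeddings would come from the network being trained rather than from fixed lists.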
Author Li, Jiakang
Wang, Yimin
Sun, Meng
Zhang, Xiongwei
Author_xml – sequence: 1
  givenname: Jiakang
  orcidid: 0000-0001-7692-841X
  surname: Li
  fullname: Li, Jiakang
  organization: Laboratory of Intelligent Information Processing, Army Engineering University, Nanjing, China
– sequence: 2
  givenname: Meng
  orcidid: 0000-0002-7435-3752
  surname: Sun
  fullname: Sun, Meng
  email: sunmengccjs@gmail.com
  organization: Laboratory of Intelligent Information Processing, Army Engineering University, Nanjing, China
– sequence: 3
  givenname: Xiongwei
  surname: Zhang
  fullname: Zhang, Xiongwei
  email: xwzhang9898@163.com
  organization: Laboratory of Intelligent Information Processing, Army Engineering University, Nanjing, China
– sequence: 4
  givenname: Yimin
  surname: Wang
  fullname: Wang, Yimin
  organization: Communications Engineering College, Army Engineering University, Nanjing, China
CODEN IAECCG
CitedBy_id crossref_primary_10_1109_ACCESS_2020_3000641
crossref_primary_10_1109_TIFS_2020_3039045
crossref_primary_10_1007_s11227_020_03535_0
crossref_primary_10_1109_TASLP_2024_3358056
crossref_primary_10_3390_app10186292
crossref_primary_10_1007_s00034_022_01974_z
crossref_primary_10_1007_s10772_021_09795_2
crossref_primary_10_1016_j_eswa_2023_122866
crossref_primary_10_3390_a16020066
crossref_primary_10_1007_s11042_021_11235_x
crossref_primary_10_1016_j_patrec_2021_06_014
crossref_primary_10_1109_TASLP_2023_3267610
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/ACCESS.2020.2964048
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
METADEX
Technology Research Database
Materials Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Directory of Open Access Journals
DatabaseTitle CrossRef
Materials Research Database
Engineered Materials Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
METADEX
Computer and Information Systems Abstracts Professional
DatabaseTitleList

Materials Research Database
Database_xml – sequence: 1
  dbid: DOA
  name: Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: RIE
  name: IEEE Xplore
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2169-3536
EndPage 7915
ExternalDocumentID oai_doaj_org_article_fe791c7f0b1f4cf3b22e9e8293784d9f
10_1109_ACCESS_2020_2964048
8950160
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61471394
  funderid: 10.13039/501100001809
– fundername: Natural Science Foundation of Jiangsu Province
  grantid: BK20180080
  funderid: 10.13039/501100004608
ISSN 2169-3536
IngestDate Tue Oct 22 15:12:54 EDT 2024
Thu Oct 10 18:08:01 EDT 2024
Fri Aug 23 03:24:22 EDT 2024
Wed Jun 26 19:26:52 EDT 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
LinkModel DirectLink
ORCID 0000-0002-7435-3752
0000-0001-7692-841X
OpenAccessLink https://doaj.org/article/fe791c7f0b1f4cf3b22e9e8293784d9f
PQID 2454712817
PQPubID 4845423
PageCount 9
ParticipantIDs crossref_primary_10_1109_ACCESS_2020_2964048
ieee_primary_8950160
doaj_primary_oai_doaj_org_article_fe791c7f0b1f4cf3b22e9e8293784d9f
proquest_journals_2454712817
PublicationCentury 2000
PublicationDate 20200000
2020-00-00
20200101
2020-01-01
PublicationDateYYYYMMDD 2020-01-01
PublicationDate_xml – year: 2020
  text: 20200000
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SourceID doaj
proquest
crossref
ieee
SourceType Open Website
Aggregation Database
Publisher
StartPage 7907
SubjectTerms Anti-spoofing
Artificial neural networks
Biometrics (access control)
Data mining
Deep learning
Embedding
Feature extraction
Machine learning
multi-task learning
Neural networks
Propagation losses
replay detection
speaker verification
Spoofing
Task analysis
triplet loss
Verification
Voice recognition
Title Joint Decision of Anti-Spoofing and Automatic Speaker Verification by Multi-Task Learning With Contrastive Loss
URI https://ieeexplore.ieee.org/document/8950160
https://www.proquest.com/docview/2454712817
https://doaj.org/article/fe791c7f0b1f4cf3b22e9e8293784d9f
Volume 8