Joint Decision of Anti-Spoofing and Automatic Speaker Verification by Multi-Task Learning With Contrastive Loss
Published in | IEEE Access, Vol. 8, pp. 7907-7915 |
---|---|
Main Authors | Li, Jiakang; Sun, Meng; Zhang, Xiongwei; Wang, Yimin |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2020 |
Abstract | Automatic speaker verification (ASV) is an emerging biometric verification technique with a growing number of applications. Before ASV is put into practice, however, both verification accuracy and anti-spoofing must be considered carefully. Anti-spoofing here refers to replay detection, where a voice is recorded, stored, and replayed to deceive an ASV system. A cascaded decision of anti-spoofing and ASV is a straightforward way to address both issues. In this paper, a joint decision of anti-spoofing and ASV was investigated in a multi-task learning framework with contrastive loss, with the aim of improving on the cascaded decision approach. A modified triplet loss was first constructed to supervise deep neural networks in extracting embedding vectors that carry both speaker-identity and spoofing information. The embedding vectors were then used as input features by back-end classifiers for speaker and spoofing classification. Experimental results on both ASVspoof 2017 and ASVspoof 2019 showed that the proposed joint decision approach with triplet loss outperformed the corresponding baselines, a recent joint decision method with Gaussian back-end fusion, and our previous joint decision approach with cross-entropy loss. |
---|---|
Author | Li, Jiakang; Wang, Yimin; Sun, Meng; Zhang, Xiongwei |
Author_xml | 1. Li, Jiakang (ORCID 0000-0001-7692-841X), Laboratory of Intelligent Information Processing, Army Engineering University, Nanjing, China; 2. Sun, Meng (ORCID 0000-0002-7435-3752; sunmengccjs@gmail.com), Laboratory of Intelligent Information Processing, Army Engineering University, Nanjing, China; 3. Zhang, Xiongwei (xwzhang9898@163.com), Laboratory of Intelligent Information Processing, Army Engineering University, Nanjing, China; 4. Wang, Yimin, Communications Engineering College, Army Engineering University, Nanjing, China |
CODEN | IAECCG |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
DOI | 10.1109/ACCESS.2020.2964048 |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 7915 |
Genre | orig-research |
GrantInformation | National Natural Science Foundation of China (grant 61471394; funder ID 10.13039/501100001809); Natural Science Foundation of Jiangsu Province (grant BK20180080; funder ID 10.13039/501100004608) |
ISSN | 2169-3536 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
ORCID | 0000-0002-7435-3752 0000-0001-7692-841X |
OpenAccessLink | https://doaj.org/article/fe791c7f0b1f4cf3b22e9e8293784d9f |
PageCount | 9 |
PublicationCentury | 2000 |
PublicationDate | 2020-01-01 |
PublicationDecade | 2020 |
PublicationPlace | Piscataway |
PublicationTitle | IEEE Access |
PublicationTitleAbbrev | Access |
PublicationYear | 2020 |
Publisher | IEEE (The Institute of Electrical and Electronics Engineers, Inc.) |
StartPage | 7907 |
SubjectTerms | Anti-spoofing; Artificial neural networks; Biometrics (access control); Data mining; Deep learning; Embedding; Feature extraction; Machine learning; multi-task learning; Neural networks; Propagation losses; replay detection; speaker verification; Spoofing; Task analysis; triplet loss; Verification; Voice recognition |
Title | Joint Decision of Anti-Spoofing and Automatic Speaker Verification by Multi-Task Learning With Contrastive Loss |
URI | https://ieeexplore.ieee.org/document/8950160 https://www.proquest.com/docview/2454712817 https://doaj.org/article/fe791c7f0b1f4cf3b22e9e8293784d9f |
Volume | 8 |
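
The abstract above says that a triplet loss supervises the extraction of embedding vectors. As a rough illustration only, the sketch below shows the standard triplet loss in plain Python. It is not the modified loss proposed in the paper, whose exact form is not given in this record. The function names, the margin value, and the toy embedding vectors are all assumptions made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is. The margin of 1.0 is an
    arbitrary illustrative default, not a value from the paper.
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


# Toy 2-D embeddings (hypothetical values, for illustration only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # e.g. same speaker, genuine speech
far_negative = [3.0, 4.0]    # e.g. a different speaker
near_negative = [0.0, 1.5]   # e.g. a replayed recording close to the anchor

print(triplet_loss(anchor, positive, far_negative))
print(triplet_loss(anchor, positive, near_negative))
```

In a joint anti-spoofing and ASV setting, one plausible reading is that negatives include both other speakers and replayed utterances, so that a single embedding separates both cases; that pairing rule is an assumption here, not a detail confirmed by the abstract.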