Latent source-specific generative factor learning for monaural speech separation using weighted-factor autoencoder
Published in | Frontiers of information technology & electronic engineering Vol. 21; no. 11; pp. 1639 - 1650 |
---|---|
Main Authors | Chen, Jing-jing; Mao, Qi-rong; Qin, You-cai; Qian, Shuang-qing; Zheng, Zhi-shen |
Format | Journal Article |
Language | English |
Published | Hangzhou: Zhejiang University Press; Springer Nature B.V.; 01.11.2020 |
ISSN | 2095-9184 2095-9230 |
DOI | 10.1631/FITEE.2000019 |
Abstract | Much recent progress in monaural speech separation (MSS) has been achieved through a series of deep learning architectures based on autoencoders, which use an encoder to condense the input signal into compressed features and then feed these features into a decoder to construct a specific audio source of interest. However, these approaches can neither learn generative factors of the original input for MSS nor construct each audio source in mixed speech. In this study, we propose a novel weighted-factor autoencoder (WFAE) model for MSS, which introduces a regularization loss in the objective function to isolate one source without containing other sources. By incorporating a latent attention mechanism and a supervised source constructor in the separation layer, WFAE can learn source-specific generative factors and a set of discriminative features for each source, leading to MSS performance improvement. Experiments on benchmark datasets show that our approach outperforms the existing methods. In terms of three important metrics, WFAE has great success on a relatively challenging MSS case, i.e., speaker-independent MSS. |
---|---|
Author | Mao, Qi-rong; Chen, Jing-jing; Qian, Shuang-qing; Qin, You-cai; Zheng, Zhi-shen |
AuthorAffiliation | School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China; Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace, Zhenjiang 212013, China |
Author_xml | 1. Chen, Jing-jing (ORCID 0000-0003-2968-0313), School of Computer Science and Communication Engineering, Jiangsu University; 2. Mao, Qi-rong (ORCID 0000-0002-0616-4431, mao_qr@ujs.edu.cn), School of Computer Science and Communication Engineering, Jiangsu University, and Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace; 3. Qin, You-cai, School of Computer Science and Communication Engineering, Jiangsu University; 4. Qian, Shuang-qing, School of Computer Science and Communication Engineering, Jiangsu University; 5. Zheng, Zhi-shen, School of Computer Science and Communication Engineering, Jiangsu University |
ClassificationCodes | TN912.3 |
ContentType | Journal Article |
Copyright | Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2020. Copyright © Wanfang Data Co. Ltd. All Rights Reserved. |
Discipline | Engineering; Computer Science |
EISSN | 2095-9230 |
EndPage | 1650 |
GrantInformation_xml | Project supported by the Key Project of the National Natural Science Foundation of China; the National Natural Science Foundation of China; the Qing Lan Talent Program of Jiangsu Province, China; and the Key Innovation Project of Undergraduate Students in Jiangsu Province, China |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 11 |
Keywords | Deep learning; Speech separation; TN912.3; Generative factors; Autoencoder |
ORCID | 0000-0003-2968-0313 0000-0002-0616-4431 |
PageCount | 12 |
PublicationDate | 2020-11-01 |
PublicationPlace | Hangzhou; Heidelberg |
PublicationTitle | Frontiers of information technology & electronic engineering |
PublicationTitleAbbrev | Front Inform Technol Electron Eng |
PublicationYear | 2020 |
Publisher | Zhejiang University Press; Springer Nature B.V. |
StartPage | 1639 |
SubjectTerms | Communications Engineering; Computer Hardware; Computer Science; Computer Systems Organization and Communication Networks; Deep learning; Dictionaries; Electrical Engineering; Electronics and Microelectronics; Instrumentation; Microphones; Networks; Neural networks; Regularization; Separation; Speech |
Title | Latent source-specific generative factor learning for monaural speech separation using weighted-factor autoencoder |
URI | https://link.springer.com/article/10.1631/FITEE.2000019 https://www.proquest.com/docview/2918724199 https://d.wanfangdata.com.cn/periodical/zjdxxbc-e202011008 |
Volume | 21 |
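
The abstract describes WFAE only at a high level: a latent attention mechanism that weights generative factors per source, a supervised source constructor, and a regularization loss that keeps one isolated source from containing the others. As a rough illustration, and not the authors' implementation, the following Python sketch shows one way such pieces could fit together. The softmax attention, the squared-cosine leakage penalty, the `reg_weight` parameter, and all function names are assumptions introduced here; none of them appear in the record.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_factors(latent, attention_scores):
    """Scale each latent factor by a per-source attention weight.

    Assumption: attention is a softmax over one score per latent factor.
    """
    weights = softmax(attention_scores)
    return [w * z for w, z in zip(weights, latent)]


def mse(estimate, target):
    """Mean squared error between two equal-length signals."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def leakage(estimate, other):
    """Squared cosine similarity between an estimate and a competing source.

    Assumption: this stands in for the regularization term; the paper's
    actual penalty is not given in the abstract.
    """
    dot = sum(e * o for e, o in zip(estimate, other))
    energy = sum(e * e for e in estimate) * sum(o * o for o in other)
    return 0.0 if energy == 0 else dot * dot / energy


def separation_loss(estimates, targets, reg_weight=0.1):
    """Reconstruction error plus a penalty for resembling the other sources."""
    loss = 0.0
    for i, estimate in enumerate(estimates):
        loss += mse(estimate, targets[i])
        for j, other in enumerate(targets):
            if j != i:
                loss += reg_weight * leakage(estimate, other)
    return loss
```

With two orthogonal toy targets such as `[[1.0, 0.0], [0.0, 1.0]]`, perfect estimates give a loss of zero, while an estimate of the first source that also carries energy from the second is penalized by both the reconstruction term and the leakage term.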