Speech training data set enhancement method and device, equipment and storage medium
The invention provides a speech training data set enhancement method and device, equipment and a storage medium. The method comprises steps of extracting a Mel spectrogram corresponding to each piece of speech training data, carrying out pixel point rearrangement processing, and obtaining a temporar...
Main Authors | TANG YANXI, QU XIAOYANG, WANG JIANZONG |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 10.08.2021 |
Abstract | The invention provides a speech training data set enhancement method, device, equipment, and storage medium. The method comprises: extracting a Mel spectrogram for each piece of speech training data; performing pixel rearrangement on it to obtain a temporary Mel spectrogram; setting, for each temporary Mel spectrogram, an erasure region area together with a shape parameter, a change parameter, or a random erasure coefficient of the erasure region, thereby obtaining a plurality of extended Mel spectrograms; and converting each extended Mel spectrogram into corresponding target speech training data, which completes the supplementation of the speech training data. The method solves the problem that a speech model is prone to overfitting during training when speech training data is scarce, improves the robustness of the speech model, and prevents the speech model from being caught in overfi… |
---|---|
Author | TANG YANXI; QU XIAOYANG; WANG JIANZONG |
ContentType | Patent |
DBID | EVB |
DatabaseName | esp@cenet |
Discipline | Medicine Chemistry Sciences Physics |
DocumentTitleAlternate | 语音训练数据集的增强方法、装置、设备及存储介质 |
ExternalDocumentID | CN113241062A |
IngestDate | Fri Jul 19 13:07:10 EDT 2024 |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | Chinese, English |
Notes | Application Number: CN202110610940 |
OpenAccessLink | https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210810&DB=EPODOC&CC=CN&NR=113241062A |
ParticipantIDs | epo_espacenet_CN113241062A |
PublicationCentury | 2000 |
PublicationDate | 20210810 |
PublicationDateYYYYMMDD | 2021-08-10 |
PublicationDate_xml | – month: 08 year: 2021 text: 20210810 day: 10 |
PublicationDecade | 2020 |
PublicationYear | 2021 |
RelatedCompanies | PING AN TECHNOLOGY (SHENZHEN) CO., LTD |
SourceID | epo |
SourceType | Open Access Repository |
SubjectTerms | ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION |
Title | Speech training data set enhancement method and device, equipment and storage medium |
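The augmentation steps named in the abstract (pixel rearrangement of a Mel spectrogram, followed by erasing a region whose area and shape are set per spectrogram) can be illustrated with a minimal sketch. The abstract does not say how pixels are rearranged or how the erasure parameters are chosen, so everything specific here is an assumption made for illustration: the spectrogram is a plain list of rows (Mel bins by time frames), rearrangement is done by shuffling blocks of time frames, and the hypothetical names `rearrange_pixels`, `random_erase`, `augment`, `area_frac`, and `aspect` are not taken from the patent. Extracting the Mel spectrogram from audio and converting the result back to speech data are omitted.

```python
import random


def rearrange_pixels(spec, block, rng):
    """Assumed rearrangement: shuffle blocks of `block` consecutive time frames."""
    n_frames = len(spec[0])
    starts = list(range(0, n_frames, block))
    rng.shuffle(starts)
    # Column order after shuffling the blocks; every column appears exactly once.
    order = [t for s in starts for t in range(s, min(s + block, n_frames))]
    return [[row[t] for t in order] for row in spec]


def random_erase(spec, area_frac, aspect, rng, fill=0.0):
    """Overwrite one randomly placed rectangle with `fill`.

    area_frac: erased area as a fraction of the whole spectrogram (assumed).
    aspect:    height/width ratio of the rectangle (assumed shape parameter).
    """
    n_mels, n_frames = len(spec), len(spec[0])
    area = area_frac * n_mels * n_frames
    # Clamp the rectangle so it always fits inside the spectrogram.
    height = max(1, min(n_mels, round((area * aspect) ** 0.5)))
    width = max(1, min(n_frames, round((area / aspect) ** 0.5)))
    top = rng.randint(0, n_mels - height)
    left = rng.randint(0, n_frames - width)
    out = [row[:] for row in spec]  # copy so the input is left untouched
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = fill
    return out


def augment(spec, n_copies, seed=0):
    """Produce `n_copies` extended spectrograms from a single input."""
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        temp = rearrange_pixels(spec, block=2, rng=rng)
        area_frac = rng.uniform(0.02, 0.2)  # illustrative range, not from the patent
        aspect = rng.uniform(0.5, 2.0)      # illustrative range, not from the patent
        copies.append(random_erase(temp, area_frac, aspect, rng))
    return copies
```

Calling `augment(spec, 3)` on an 8-by-10 spectrogram returns three 8-by-10 variants, each with shuffled frame blocks and one erased rectangle. A practical pipeline would more likely operate on arrays from a numerical library and would need a separate inversion step to turn each extended spectrogram back into audio.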