Speech training data set enhancement method and device, equipment and storage medium

Bibliographic Details
Main Authors WANG JIANZONG, TANG YANXI, QU XIAOYANG
Format Patent
Language Chinese
English
Published 10.08.2021

Abstract The invention provides a method, device, equipment, and storage medium for enhancing a speech training data set. The method comprises: extracting the Mel spectrogram corresponding to each piece of speech training data; rearranging the pixel points of each Mel spectrogram to obtain a temporary Mel spectrogram; setting the area of an erasure region, together with a shape parameter, a change parameter, or a random erasure coefficient of that region, for each temporary Mel spectrogram to obtain a plurality of extended Mel spectrograms; and converting each extended Mel spectrogram into corresponding target speech training data, thereby supplementing the speech training data. The method addresses the problem that a speech model is prone to overfitting during training when speech training data is scarce, improves the robustness of the speech model, and prevents the speech model from being caught in overfi
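The augmentation steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented procedure: the abstract does not say how pixel points are rearranged or how the erasure parameters are chosen, so the block-wise frame shuffle, the rectangular erasure region, and every function and parameter name below (`rearrange_pixels`, `random_erase`, `augment`, `block`, `area_frac`, `aspect`, `fill`) are assumptions made for illustration. The sketch takes a Mel spectrogram that has already been computed as a 2-D array of shape (Mel bands, time frames) and leaves out both the extraction step and the conversion back to audio.

```python
import numpy as np


def rearrange_pixels(mel, block=4, rng=None):
    """Shuffle time frames inside consecutive blocks of `block` columns.

    Assumed rearrangement scheme; the abstract only says "pixel point
    rearrangement" without giving details.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = mel.copy()
    n_frames = mel.shape[1]
    for start in range(0, n_frames, block):
        stop = min(start + block, n_frames)
        # Permute the column indices that fall inside this block.
        perm = rng.permutation(stop - start) + start
        out[:, start:stop] = mel[:, perm]
    return out


def random_erase(mel, area_frac=0.1, aspect=1.0, fill=0.0, rng=None):
    """Overwrite one randomly placed rectangle with `fill`.

    `area_frac` stands in for the erasure region area and `aspect`
    (height / width) for the shape parameter; both are hypothetical names.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_mels, n_frames = mel.shape
    target_area = area_frac * n_mels * n_frames
    # Derive the rectangle size from area and aspect ratio, clamped to the image.
    height = min(max(int(round(np.sqrt(target_area * aspect))), 1), n_mels)
    width = min(max(int(round(np.sqrt(target_area / aspect))), 1), n_frames)
    top = int(rng.integers(0, n_mels - height + 1))
    left = int(rng.integers(0, n_frames - width + 1))
    out = mel.copy()
    out[top:top + height, left:left + width] = fill
    return out


def augment(mel, n_copies=3, seed=0):
    """Produce several extended spectrograms from one input spectrogram."""
    rng = np.random.default_rng(seed)
    temporary = rearrange_pixels(mel, rng=rng)
    return [
        random_erase(
            temporary,
            area_frac=rng.uniform(0.02, 0.2),
            aspect=rng.uniform(0.5, 2.0),
            rng=rng,
        )
        for _ in range(n_copies)
    ]
```

Calling `augment(mel)` on an array of shape (80, 100) would return three arrays of the same shape, each with a different erased rectangle. Varying the area and aspect ratio per copy is one plausible reading of how a single temporary spectrogram yields a plurality of extended ones.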
Author TANG YANXI
QU XIAOYANG
WANG JIANZONG
Author_xml – fullname: WANG JIANZONG
– fullname: TANG YANXI
– fullname: QU XIAOYANG
ContentType Patent
DBID EVB
DatabaseName esp@cenet
DatabaseTitleList
Database_xml – sequence: 1
  dbid: EVB
  name: esp@cenet
  url: http://worldwide.espacenet.com/singleLineSearch?locale=en_EP
  sourceTypes: Open Access Repository
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
Chemistry
Sciences
Physics
DocumentTitleAlternate 语音训练数据集的增强方法、装置、设备及存储介质
ExternalDocumentID CN113241062A
GroupedDBID EVB
ID FETCH-epo_espacenet_CN113241062A3
IEDL.DBID EVB
IngestDate Fri Jul 19 13:07:10 EDT 2024
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language Chinese
English
LinkModel DirectLink
MergedId FETCHMERGED-epo_espacenet_CN113241062A3
Notes Application Number: CN202110610940
OpenAccessLink https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210810&DB=EPODOC&CC=CN&NR=113241062A
ParticipantIDs epo_espacenet_CN113241062A
PublicationCentury 2000
PublicationDate 20210810
PublicationDateYYYYMMDD 2021-08-10
PublicationDate_xml – month: 08
  year: 2021
  text: 20210810
  day: 10
PublicationDecade 2020
PublicationYear 2021
RelatedCompanies PING AN TECHNOLOGY (SHENZHEN) CO., LTD
RelatedCompanies_xml – name: PING AN TECHNOLOGY (SHENZHEN) CO., LTD
Score 3.4752572
SourceID epo
SourceType Open Access Repository
SubjectTerms ACOUSTICS
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
Title Speech training data set enhancement method and device, equipment and storage medium
URI https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210810&DB=EPODOC&locale=&CC=CN&NR=113241062A