Gesture recognition method and device based on voice and video, equipment and medium

Bibliographic Details
Main Authors	LI CAIBO, GENG CHANGBIAO, WEN JINHONG, MA HAN
Format	Patent
Language	Chinese English
Published	20.10.2023
Subjects	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Abstract	The invention provides a gesture recognition method and device based on voice and video, equipment, and a medium, and relates to the field of data processing. The method comprises: obtaining video data and voice data, and preprocessing them to obtain first processing data and second processing data, respectively; establishing an initial model and training it with training data to obtain a target model, wherein the target model comprises a first feature extraction network and a second feature extraction network formed by a plurality of stacked convolution blocks, and a feature fusion module formed by a plurality of convolution layers; performing image feature extraction on the first processing data through the first feature extraction network to obtain a first feature; performing voice feature extraction on the second processing data through the second feature extraction network to obtain a second feature; and based on the first feature and the seco...
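The two-branch structure named in the abstract can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names (`conv2d`, `extract`, `fuse`, `recognize`, `build_params`), the channel widths, the 3x3 kernels, the ReLU activations, the treatment of the preprocessed voice data as a spectrogram-like 2-D array, the channel-wise concatenation before fusion, and the pooled linear classification head. The abstract is cut off before it states how the two features are combined, so the fusion step below is only one plausible reading. Weights are random; no training is shown.

```python
import numpy as np

def conv2d(x, w):
    """Valid 2-D cross-correlation. x: (C_in, H, W); w: (C_out, C_in, K, K)."""
    k = w.shape[-1]
    # win has shape (C_in, H-K+1, W-K+1, K, K): one KxK patch per output pixel.
    win = np.lib.stride_tricks.sliding_window_view(x, (k, k), axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", win, w)

def conv_block(x, w):
    """One convolution block: convolution followed by ReLU (assumed layout)."""
    return np.maximum(conv2d(x, w), 0.0)

def extract(x, weights):
    """Feature extraction network: a stack of convolution blocks."""
    for w in weights:
        x = conv_block(x, w)
    return x

def fuse(f1, f2, weights):
    """Fusion module: concatenate on the channel axis (assumption), then
    apply several convolution layers. f1 and f2 must share H and W."""
    x = np.concatenate([f1, f2], axis=0)
    for w in weights:
        x = conv_block(x, w)
    return x

def init_convs(rng, channels, k=3):
    """Random kernels for consecutive channel widths (hypothetical sizes)."""
    return [0.1 * rng.standard_normal((c_out, c_in, k, k))
            for c_in, c_out in zip(channels[:-1], channels[1:])]

def build_params(rng, num_gestures):
    return {
        "video": init_convs(rng, [3, 8, 16]),     # first extraction network
        "voice": init_convs(rng, [1, 8, 16]),     # second extraction network
        "fusion": init_convs(rng, [32, 32, 16]),  # fusion convolution layers
        "head": 0.1 * rng.standard_normal((num_gestures, 16)),
    }

def recognize(frame, spec, params):
    """frame: preprocessed video frame (3, H, W); spec: preprocessed voice
    data as a (1, H, W) array. Returns (predicted label, logits)."""
    f1 = extract(frame, params["video"])   # first feature
    f2 = extract(spec, params["voice"])    # second feature
    fused = fuse(f1, f2, params["fusion"])
    logits = params["head"] @ fused.mean(axis=(1, 2))  # global average pool
    return int(np.argmax(logits)), logits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = build_params(rng, num_gestures=5)
    frame = rng.standard_normal((3, 16, 16))
    spec = rng.standard_normal((1, 16, 16))
    label, logits = recognize(frame, spec, params)
    print(label, logits.shape)
```

Both inputs are given the same spatial size so that the two feature maps can be concatenated directly; a real system would need some alignment step between the video and voice features, which the visible part of the abstract does not specify.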
ContentType	Patent
DBID	EVB
DatabaseName	esp@cenet
DatabaseTitleList
Database_xml	– sequence: 1 dbid: EVB name: esp@cenet url: http://worldwide.espacenet.com/singleLineSearch?locale=en_EP sourceTypes: Open Access Repository
DeliveryMethod	fulltext_linktorsrc
Discipline	Medicine Chemistry Sciences Physics
DocumentTitleAlternate	一种基于语音和视频的手势识别方法、装置、设备及介质
ExternalDocumentID	CN116910693A
GroupedDBID	EVB
ID	FETCH-epo_espacenet_CN116910693A3
IEDL.DBID	EVB
IngestDate	Fri Jul 19 13:09:20 EDT 2024
IsOpenAccess	true
IsPeerReviewed	false
IsScholarly	false
LinkModel	DirectLink
MergedId	FETCHMERGED-epo_espacenet_CN116910693A3
Notes	Application Number: CN202310942829
OpenAccessLink	https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20231020&DB=EPODOC&CC=CN&NR=116910693A
ParticipantIDs	epo_espacenet_CN116910693A
PublicationCentury	2000
PublicationDate	20231020
PublicationDateYYYYMMDD	2023-10-20
PublicationDate_xml	– month: 10 year: 2023 text: 20231020 day: 20
PublicationDecade	2020
PublicationYear	2023
RelatedCompanies	ZHAOTONG LIANGFENGTAI INFORMATION TECHNOLOGY CO., LTD
RelatedCompanies_xml	– name: ZHAOTONG LIANGFENGTAI INFORMATION TECHNOLOGY CO., LTD
SourceID	epo
SourceType	Open Access Repository
linkProvider	European Patent Office