Techniques for disambiguating speech input using multimodal interfaces

Bibliographic Details
Main Authors	SIBAL, SANDEEP; VAIDYA, SHIRISH; DOMINACH, RICHARD; ISUKAPALLI, SASTRY
Format	Patent
Language	English; French; German
Published	21.01.2009
Subjects	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Abstract	A system for disambiguating speech input. The system comprises a speech recognition component (110) for receiving recorded audio or speech input (104) and for generating one or more tokens corresponding to said input and a confidence value for each of said one or more tokens, the confidence value being indicative of a likelihood that said token correctly represents the respective input. The system also comprises a selection component (116) for identifying, according to a selection algorithm, which of any two or more tokens generated for said input are to be presented to a user (108) as alternatives (120); one or more disambiguation components (118,124) for performing an interaction with the user, in which the alternatives are presented to the user and the user's selection (122) is received; and an output interface (126) for presenting the user's selection as an input to an application (106). The system is characterised in that said interaction with the user uses a multimodal interface and said alternatives are presented to the user as a multimodal output and the user's selection is received as a multimodal input.
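
The flow the abstract describes (tokens with confidence values, a selection step, a user interaction, and a hand-off to the application) can be illustrated with a minimal Python sketch. The abstract does not specify the selection algorithm, so the threshold-and-margin rule below is an assumption, as are all names (`Token`, `select_alternatives`, `disambiguate`, `ask_user`) and the default parameter values. The `ask_user` callback stands in for the multimodal interaction (for example, showing a list on screen and accepting a tap or a spoken choice).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Token:
    text: str
    confidence: float  # likelihood in [0, 1] that the token matches the input


def select_alternatives(tokens: Sequence[Token], accept_at: float = 0.9,
                        margin: float = 0.3, max_shown: int = 3) -> List[Token]:
    """Hypothetical selection rule (not taken from the patent)."""
    ranked = sorted(tokens, key=lambda t: t.confidence, reverse=True)
    if not ranked:
        return []
    best = ranked[0]
    # A sufficiently confident top token needs no disambiguation.
    if best.confidence >= accept_at:
        return [best]
    # Otherwise keep only tokens whose confidence is close to the best one.
    close = [t for t in ranked if best.confidence - t.confidence <= margin]
    return close[:max_shown]


def disambiguate(tokens: Sequence[Token],
                 ask_user: Callable[[List[Token]], Token]) -> str:
    """Return the text to pass to the application as input."""
    alternatives = select_alternatives(tokens)
    if not alternatives:
        raise ValueError("recognizer produced no tokens")
    if len(alternatives) == 1:
        return alternatives[0].text
    # Present the alternatives and receive the user's selection.
    return ask_user(alternatives).text
```

With tokens such as `Token("Austin", 0.55)`, `Token("Boston", 0.50)` and `Token("Houston", 0.10)`, this rule would present the first two as alternatives and forward whichever one the `ask_user` callback returns.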
DatabaseName	esp@cenet
Discipline	Medicine Chemistry Sciences Physics
DocumentTitleAlternate	Techniken zur Disambiguierung von Spracheneingabe unter Verwendung multimodaler Schnittstellen (German); Techniques pour résoudre l'ambiguïté d'entrées vocales à l'aide d'interfaces multimodales (French)
ExternalDocumentID	EP2017828A1
Notes	Application Number: EP20080168464
OpenAccessLink	https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20090121&DB=EPODOC&CC=EP&NR=2017828A1
RelatedCompanies	KIRUSA, INC