DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants
Published in | arXiv.org |
---|---|
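Main Authors | Deepak Muralidharan, Joel Ruben Antony Moniz, Weicheng Zhang, Stephen Pulman, Lin Li, Megan Barnes, Jingjing Pan, Jason Williams, Alex Acero |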
Format | Paper; Journal Article |
Language | English |
Published | Ithaca, Cornell University Library, arXiv.org, 15.08.2021 |
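Subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning; Errors; Labels; Speech recognition |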
Abstract | Named entity recognition (NER) is usually developed and tested on text from well-written sources. However, in intelligent voice assistants, where NER is an important component, input to NER may be noisy because of user or speech recognition error. In applications, entity labels may change frequently, and non-textual properties like topicality or popularity may be needed to choose among alternatives. We describe a NER system intended to address these problems. We test and train this system on a proprietary user-derived dataset. We compare with a baseline text-only NER system; the baseline enhanced with external gazetteers; and the baseline enhanced with the search and indirect labelling techniques we describe below. The final configuration gives around 6% reduction in NER error rate. We also show that this technique improves related tasks, such as semantic parsing, with an improvement of up to 5% in error rate. |
---|---|
Author | Barnes, Megan; Li, Lin; Pan, Jingjing; Acero, Alex; Muralidharan, Deepak; Zhang, Weicheng; Pulman, Stephen; Williams, Jason; Joel Ruben Antony Moniz |
Author_xml | – sequence: 1 givenname: Deepak surname: Muralidharan fullname: Muralidharan, Deepak – sequence: 2 fullname: Joel Ruben Antony Moniz – sequence: 3 givenname: Weicheng surname: Zhang fullname: Zhang, Weicheng – sequence: 4 givenname: Stephen surname: Pulman fullname: Pulman, Stephen – sequence: 5 givenname: Lin surname: Li fullname: Li, Lin – sequence: 6 givenname: Megan surname: Barnes fullname: Barnes, Megan – sequence: 7 givenname: Jingjing surname: Pan fullname: Pan, Jingjing – sequence: 8 givenname: Jason surname: Williams fullname: Williams, Jason – sequence: 9 givenname: Alex surname: Acero fullname: Acero, Alex |
BackLink | View paper in arXiv: https://doi.org/10.48550/arXiv.2108.06633; view published paper (access to full text may be restricted): https://doi.org/10.21437/Interspeech.2021-1877 |
ContentType | Paper Journal Article |
Copyright | 2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
DBID | 8FE 8FG ABJCF ABUWG AFKRA AZQEC BENPR BGLVJ CCPQU DWQXO HCIFZ L6V M7S PIMPY PQEST PQQKQ PQUKI PRINS PTHSS AKY GOX |
DOI | 10.48550/arxiv.2108.06633 |
DatabaseName | ProQuest SciTech Collection ProQuest Technology Collection Materials Science & Engineering Collection ProQuest Central (Alumni) ProQuest Central ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One Community College ProQuest Central Korea SciTech Premium Collection ProQuest Engineering Collection Engineering Database Publicly Available Content Database ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China Engineering Collection arXiv Computer Science arXiv.org |
DatabaseTitle | Publicly Available Content Database Engineering Database Technology Collection ProQuest Central Essentials ProQuest One Academic Eastern Edition ProQuest Central (Alumni Edition) SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central ProQuest Engineering Collection ProQuest One Academic UKI Edition ProQuest Central Korea Materials Science & Engineering Collection ProQuest One Academic Engineering Collection |
DatabaseTitleList | Publicly Available Content Database |
Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository – sequence: 2 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Physics |
EISSN | 2331-8422 |
ExternalDocumentID | 2108_06633 |
Genre | Working Paper/Pre-Print |
GroupedDBID | 8FE 8FG ABJCF ABUWG AFKRA ALMA_UNASSIGNED_HOLDINGS AZQEC BENPR BGLVJ CCPQU DWQXO FRJ HCIFZ L6V M7S M~E PIMPY PQEST PQQKQ PQUKI PRINS PTHSS AKY GOX |
ID | FETCH-LOGICAL-a953-f894e12c72b6e0caefff348fdb096df4e35246210401e061ec33f310782070e43 |
IEDL.DBID | BENPR |
IngestDate | Mon Jan 08 05:42:44 EST 2024 Thu Oct 10 19:23:53 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-a953-f894e12c72b6e0caefff348fdb096df4e35246210401e061ec33f310782070e43 |
OpenAccessLink | https://www.proquest.com/docview/2562088633?pq-origsite=%requestingapplication% |
PQID | 2562088633 |
PQPubID | 2050157 |
ParticipantIDs | arxiv_primary_2108_06633 proquest_journals_2562088633 |
PublicationCentury | 2000 |
PublicationDate | 20210815 |
PublicationDateYYYYMMDD | 2021-08-15 |
PublicationDate_xml | – month: 08 year: 2021 text: 20210815 day: 15 |
PublicationDecade | 2020 |
PublicationPlace | Ithaca |
PublicationPlace_xml | – name: Ithaca |
PublicationTitle | arXiv.org |
PublicationYear | 2021 |
Publisher | Cornell University Library, arXiv.org |
Publisher_xml | – name: Cornell University Library, arXiv.org |
SSID | ssj0002672553 |
Score | 1.8147776 |
SecondaryResourceType | preprint |
Snippet | Named entity recognition (NER) is usually developed and tested on text from well-written sources. However, in intelligent voice assistants, where NER is an... |
SourceID | arxiv proquest |
SourceType | Open Access Repository Aggregation Database |
SubjectTerms | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning; Errors; Labels; Speech recognition |
Title | DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants |
URI | https://www.proquest.com/docview/2562088633 https://arxiv.org/abs/2108.06633 |
hasFullText | 1 |
inHoldings | 1 |