Computationally Efficient Context-Free Named Entity Disambiguation with Wikipedia
The induction of the semantics of unstructured text corpora is a crucial task for modern natural language processing and artificial intelligence applications. The Named Entity Disambiguation task comprises the extraction of Named Entities and their linking to an appropriate representation from a con...
Saved in:
Published in | Information (Basel) Vol. 13; no. 8; p. 367 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published |
Basel
MDPI AG
01.08.2022
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | The induction of the semantics of unstructured text corpora is a crucial task for modern natural language processing and artificial intelligence applications. The Named Entity Disambiguation task comprises the extraction of Named Entities and their linking to an appropriate representation from a concept ontology based on the available information. This work introduces novel methodologies, leveraging domain knowledge extraction from Wikipedia in a simple yet highly effective approach. In addition, we introduce a fuzzy logic model with a strong focus on computational efficiency. We also present a new measure, decisive in both methods for the entity linking selection and the quantification of the confidence of the produced entity links, namely the relative commonness measure. The experimental results of our approach on established datasets revealed state-of-the-art accuracy and run-time performance in the domain of fast, context-free Wikification, by relying on an offline pre-processing stage on the corpus of Wikipedia. The methods introduced can be leveraged as stand-alone NED methodologies, propitious for applications on mobile devices, or in the context of vastly reducing the complexity of deep neural network approaches as a first context-free layer. |
---|---|
ISSN: | 2078-2489 2078-2489 |
DOI: | 10.3390/info13080367 |