Computationally Efficient Context-Free Named Entity Disambiguation with Wikipedia

Bibliographic Details
Published in: Information (Basel), Vol. 13, no. 8, p. 367
Main Authors: Simos, Michael Angelos; Makris, Christos
Format: Journal Article
Language: English
Published: Basel, MDPI AG, 01.08.2022
Summary: The induction of the semantics of unstructured text corpora is a crucial task for modern natural language processing and artificial intelligence applications. The Named Entity Disambiguation task comprises the extraction of Named Entities and their linking to an appropriate representation from a concept ontology based on the available information. This work introduces novel methodologies, leveraging domain knowledge extraction from Wikipedia in a simple yet highly effective approach. In addition, we introduce a fuzzy logic model with a strong focus on computational efficiency. We also present a new measure, decisive in both methods for the entity linking selection and the quantification of the confidence of the produced entity links, namely the relative commonness measure. The experimental results of our approach on established datasets revealed state-of-the-art accuracy and run-time performance in the domain of fast, context-free Wikification, by relying on an offline pre-processing stage on the corpus of Wikipedia. The methods introduced can be leveraged as stand-alone NED methodologies, propitious for applications on mobile devices, or in the context of vastly reducing the complexity of deep neural network approaches as a first context-free layer.
ISSN: 2078-2489
DOI: 10.3390/info13080367