Computationally Efficient Context-Free Named Entity Disambiguation with Wikipedia

Bibliographic Details
Published in	Information (Basel) Vol. 13; no. 8; p. 367
Main Authors	Simos, Michael Angelos; Makris, Christos
Format	Journal Article
Language	English
Published	Basel: MDPI AG, 01.08.2022
Subjects	Accuracy; Artificial intelligence; Artificial neural networks; Computational efficiency; Context; context-free Wikification; Deep learning; Domains; Efficiency; Electronic devices; Encyclopedias; Fuzzy logic; Graph representations; Knowledge acquisition; Knowledge representation; Machine learning; Methods; named entity disambiguation; Natural language processing; Neural networks; ontologies; Ontology; Semantics; text annotation; Unstructured data; Wikification; word sense disambiguation
Summary:	The induction of the semantics of unstructured text corpora is a crucial task for modern natural language processing and artificial intelligence applications. The Named Entity Disambiguation task comprises the extraction of Named Entities and their linking to an appropriate representation from a concept ontology based on the available information. This work introduces novel methodologies, leveraging domain knowledge extraction from Wikipedia in a simple yet highly effective approach. In addition, we introduce a fuzzy logic model with a strong focus on computational efficiency. We also present a new measure, decisive in both methods for the entity linking selection and the quantification of the confidence of the produced entity links, namely the relative commonness measure. The experimental results of our approach on established datasets revealed state-of-the-art accuracy and run-time performance in the domain of fast, context-free Wikification, by relying on an offline pre-processing stage on the corpus of Wikipedia. The methods introduced can be leveraged as stand-alone NED methodologies, propitious for applications on mobile devices, or in the context of vastly reducing the complexity of deep neural network approaches as a first context-free layer.
ISSN:	2078-2489
DOI:	10.3390/info13080367