SWAT: A System for Detecting Salient Wikipedia Entities in Texts
Computational Intelligence, Wiley-Blackwell Publishing (2019) We study the problem of entity salience by proposing the design and implementation of SWAT, a system that identifies the salient Wikipedia entities occurring in an input document. SWAT consists of several modules that are able to detect a...
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 10.04.2018 |
Subjects | |
Online Access | Get full text |
Summary: | Computational Intelligence, Wiley-Blackwell Publishing (2019). We study the problem of entity salience by proposing the design and
implementation of SWAT, a system that identifies the salient Wikipedia entities
occurring in an input document. SWAT consists of several modules that detect
Wikipedia entities on the fly and classify them as salient or not, based
on a large number of syntactic, semantic and latent features extracted
via a supervised process trained over millions of examples drawn
from the New York Times corpus. Validation is performed through a
large experimental assessment, which shows that SWAT improves on known
solutions over all publicly available datasets. We release SWAT via an API that
we describe and comment on in the paper in order to ease its use in other
software. |
---|---|
DOI: | 10.48550/arxiv.1804.03580 |