Balancing coverage and specificity for semantic labelling of subject columns

Bibliographic Details
Published in Knowledge-based systems Vol. 240; p. 108092
Main Authors Alobaid, Ahmad; Corcho, Oscar
Format Journal Article
Language English
Published Amsterdam Elsevier B.V 15.03.2022
Elsevier Science Ltd
Summary: Many data are published on the Web using tabular data formats (e.g., spreadsheets). One of the main challenges for their effective (re)use is their generalized lack of semantics (e.g., column names are not usually standardized, and their meaning and content are not always clear). There is a common understanding that the reuse of tabular data may be improved by annotating them with the types used in knowledge graphs. In this paper, we present a novel approach to automatically type entity columns in tabular data with ontology classes. In contrast with existing proposals in the state-of-the-art, our approach does not require external linguistic resources, lookup services, model training, building a model of the knowledge graph beforehand, or having a human in the loop.
ISSN: 0950-7051
1872-7409
DOI: 10.1016/j.knosys.2021.108092