Large-scale Semantic Integration of Linked Data: A Survey

Bibliographic Details
Published in: ACM Computing Surveys Vol. 52; no. 5; pp. 1-40
Main Authors: Mountantonakis, Michalis; Tzitzikas, Yannis
Format: Journal Article
Language: English
Published: Baltimore, Association for Computing Machinery, 30.09.2020
ISSN: 0360-0300, 1557-7341
DOI: 10.1145/3345551

Summary: A large number of published datasets (or sources) that follow Linked Data principles are currently available, and this number is growing rapidly. However, the major target of Linked Data, i.e., linking and integration, is not easy to achieve. In general, information integration is difficult because (a) datasets are produced, kept, or managed by different organizations using different models, schemas, or formats; (b) the same real-world entities or relationships are referred to by different URIs or names and in different natural languages; (c) datasets usually contain complementary information; (d) datasets can contain data that are erroneous, out-of-date, or conflicting; (e) datasets, even about the same domain, may follow different conceptualizations of the domain; and (f) everything can change (e.g., schemas, data) as time passes. This article surveys the work that has been done in the area of Linked Data integration: it identifies the main actors and use cases, analyzes and factorizes the integration process according to various dimensions, and discusses the methods that are used in each step. Emphasis is placed on methods that can be used for integrating several datasets. Based on this analysis, the article concludes with directions that are worth further research.