Efficient, semantics-rich transformation and integration of large datasets
Published in | Expert systems with applications Vol. 133; pp. 198 - 214 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York, Elsevier Ltd, 01.11.2019; Elsevier BV |
Subjects | |
ISSN | 0957-4174 1873-6793 |
DOI | 10.1016/j.eswa.2019.05.010 |
Summary: | • This paper presents a new algorithm for the generation of RDF datasets.
• Its scalability is guaranteed by high-performance computing techniques.
• The performance of the algorithm is tested in three different use cases.
• Experiments show that its performance is better than that of related tools.
The digital age is making more datasets available through the Internet, but their interoperability is still limited. The Semantic Web should play a fundamental role in achieving interoperable datasets. The semantic exploitation of data requires its efficient transformation into semantic formats and the integration of heterogeneous sources. Existing tools for the semantic transformation of large volumes of data either have limited scalability or do not provide a semantics-rich representation of the data.
The goal of this work was to show how scalable semantic data transformation processes can be designed and implemented, thereby addressing the first limitation mentioned above. Here, we propose an application of high-performance computing techniques to overcome the scalability limitation. The proposed method was implemented as an upgrade of our Semantic Web Integration Tool (SWIT). Additional improvements for supporting the transformation process in SWIT are also described in this paper. We evaluated the new method by using three case studies from the areas of bioinformatics, movies and persons. The results showed a significant speed-up with respect to the original SWIT algorithm and the related tools. The lessons learnt in our work allowed us to configure semantic transformation processes efficiently. |
---|---|
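
The summary does not describe the algorithm itself, so the following is only a generic sketch of one common way to parallelise RDF generation: split the input records into chunks and convert each chunk to N-Triples in a separate worker. It is not the SWIT method. The namespace `http://example.org/`, the record layout (dictionaries with an `id` key) and all function names are assumptions made for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from itertools import islice
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not taken from the paper


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def transform_chunk(rows):
    """Convert a list of records into N-Triples lines, one triple per field."""
    lines = []
    for row in rows:
        subject = f"<{BASE}resource/{quote(str(row['id']), safe='')}>"
        for key, value in row.items():
            if key == "id":
                continue  # the id is used for the subject IRI only
            predicate = f"<{BASE}property/{quote(str(key), safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def transform_parallel(rows, chunk_size=1000, workers=4, use_processes=True):
    """Transform chunks concurrently; output order follows input order."""
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=workers) as pool:
        parts = pool.map(transform_chunk, chunked(rows, chunk_size))
        return [line for part in parts for line in part]
```

Because each record is converted independently, the chunks can be processed in any order and merged afterwards. With `use_processes=True`, the call should be made from under an `if __name__ == "__main__":` guard on platforms that spawn worker processes; `use_processes=False` uses threads, which is convenient for testing but gives little speed-up for CPU-bound work in CPython.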