Redundancy in Linked Data Partitioning for Efficient Query Evaluation
The problem of efficient querying large amount of linked data using Map-Reduce is investigated in this paper. The proposed approach is based on the following assumptions: a) Data graphs are arbitrarily partitioned in the distributed file system is such a way that replication of data triples between...
Saved in:
Published in | 2015 3rd International Conference on Future Internet of Things and Cloud pp. 497 - 504 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published |
IEEE
01.08.2015
|
Subjects | |
Online Access | Get full text |
DOI | 10.1109/FiCloud.2015.36 |
Cover
Summary: | The problem of efficient querying large amount of linked data using Map-Reduce is investigated in this paper. The proposed approach is based on the following assumptions: a) Data graphs are arbitrarily partitioned in the distributed file system is such a way that replication of data triples between the data segments is allowed. b) Data triples are replicated is such a way that answers to a special form of queries, called subject-object star queries, can be obtained from a single data segment. c) Each query posed by the user, can be transformed into a set of subject-object star sub queries. We propose a one and a half phase, scalable, Map-Reduce algorithm that efficiently computes the answers of the initial query by computing and appropriately combining the sub query answers. We prove that, under certain conditions, query can be answered in a single map-reduce phase. |
---|---|
DOI: | 10.1109/FiCloud.2015.36 |