A Combined Index for Mixed Structured and Unstructured Data
Published in | 2015 12th Web Information System and Application Conference (WISA) pp. 217 - 222 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2015 |
Summary: | In the big data era, one of the major challenges is the large volume of mixed structured and unstructured data, which comes from heterogeneous sources. Because they differ in form, structured and unstructured data are often treated separately. However, they may describe the same real-world entities. If a query involves both structured data and its unstructured counterpart, executing it separately against each is inefficient. The paper presents a novel index structure tailored to combinations of structured and unstructured data. The combined index is a joint index over structured databases and unstructured documents, based on entity co-occurrences. It is also a semantic index that describes the semantic relationships between entities and their multiple resources. The index is stored as RDF graphs, and queries are SPARQL-like. Experiments show that the associated index can not only provide apposite information but also execute queries efficiently. |
---|---|
ISBN: | 9781467393713 1467393711 |
DOI: | 10.1109/WISA.2015.36 |
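
The summary describes the index only at a high level, so the following is a minimal illustrative sketch, not the paper's implementation. It assumes the index can be approximated as a set of subject-predicate-object triples that link each entity to the database rows and documents in which it occurs, plus co-occurrence links between entities mentioned in the same document. The predicate names, the sample data, the naive substring matching, and the helpers `build_index` and `match` are all hypothetical; `match` only imitates a single SPARQL-style triple pattern with wildcards.

```python
from collections import defaultdict

# Hypothetical predicate names; the record does not give the paper's vocabulary.
IN_ROW = "appearsInRow"
IN_DOC = "mentionedInDoc"
CO_OCCURS = "coOccursWith"


def build_index(rows, docs):
    """Build a set of (subject, predicate, object) triples.

    rows: mapping of row id -> list of entity names stored in that row
    docs: mapping of doc id -> document text
    """
    triples = set()
    entities = set()
    # Link every entity to the structured rows that contain it.
    for row_id, names in rows.items():
        for name in names:
            triples.add((name, IN_ROW, row_id))
            entities.add(name)
    # Link every entity to the documents that mention it.
    mentions = defaultdict(set)
    for doc_id, text in docs.items():
        lowered = text.lower()
        for name in entities:
            # Naive substring match stands in for real entity recognition.
            if name.lower() in lowered:
                triples.add((name, IN_DOC, doc_id))
                mentions[doc_id].add(name)
    # Entities mentioned in the same document are marked as co-occurring.
    for names in mentions.values():
        for a in names:
            for b in names:
                if a != b:
                    triples.add((a, CO_OCCURS, b))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return sorted triples matching one pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Made-up sample data for illustration only.
rows = {"emp:1": ["Alice", "Acme"], "emp:2": ["Bob", "Globex"]}
docs = {
    "doc:1": "Alice presented the Acme roadmap.",
    "doc:2": "Bob left Globex.",
}
index = build_index(rows, docs)
```

With this sketch, `match(index, s="Alice")` returns every row, document, and co-occurring entity linked to `Alice` in one lookup, which is the kind of joint access over both data forms that the summary attributes to the combined index.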