A Combined Index for Mixed Structured and Unstructured Data

Bibliographic Details
Published in 2015 12th Web Information System and Application Conference (WISA), pp. 217-222
Main Authors Chunying Zhu, Qingzhong Li, Lanju Kong, Song Wei
Format Conference Proceeding
Language English
Published IEEE 01.09.2015
Summary: In the big data era, one of the major challenges is the large volume of mixed structured and unstructured data, which comes from heterogeneous sources. Because of their different forms, structured and unstructured data are often treated separately. However, they may describe the same real-world entities. If a query involves both structured data and its unstructured counterpart, executing it separately against each is inefficient. The paper presents a novel index structure tailored to combinations of structured and unstructured data. The combined index is a joint index over a structured database and unstructured documents, based on entity co-occurrences. It is also a semantic index that describes the semantic relationships between entities and their multiple resources. The index is stored as RDF graphs, and queries are SPARQL-like. Experiments show that the combined index not only provides relevant information but also executes queries efficiently.
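
The idea of a joint, co-occurrence-based index can be illustrated with a minimal sketch. This is not the paper's implementation: the toy data, the predicate names `appearsIn` and `cooccursWith`, the substring-based entity spotting, and the helpers `build_index` and `match` are all assumptions made for illustration. The sketch stores the index as plain subject-predicate-object triples (the shape of an RDF graph) and answers a single triple pattern with wildcards, loosely in the spirit of a SPARQL-like query.

```python
# Hypothetical toy data: rows from a structured table and free-text documents.
ROWS = {
    "row:1": {"name": "Alice", "city": "Jinan"},
    "row:2": {"name": "Bob", "city": "Beijing"},
}
DOCS = {
    "doc:1": "Alice presented a paper in Jinan.",
    "doc:2": "Bob met Alice last year.",
}


def build_index(rows, docs):
    """Return a set of (subject, predicate, object) triples linking entities to resources."""
    triples = set()
    entities = set()
    # Assumption: every attribute value in a structured row is treated as an entity.
    for row_id, row in rows.items():
        for value in row.values():
            entities.add(value)
            triples.add((value, "appearsIn", row_id))
    # Naive entity spotting: case-sensitive substring match against known entities.
    for doc_id, text in docs.items():
        found = sorted(e for e in entities if e in text)
        for entity in found:
            triples.add((entity, "appearsIn", doc_id))
        # Entities found in the same document are recorded as co-occurring.
        for i, a in enumerate(found):
            for b in found[i + 1:]:
                triples.add((a, "cooccursWith", b))
                triples.add((b, "cooccursWith", a))
    return triples


def match(triples, s=None, p=None, o=None):
    """Match one triple pattern; None is a wildcard, similar to a SPARQL variable."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


index = build_index(ROWS, DOCS)
# One lookup returns both structured and unstructured resources for an entity.
resources = [o for _, _, o in match(index, s="Alice", p="appearsIn")]
neighbours = [o for _, _, o in match(index, s="Alice", p="cooccursWith")]
```

In this sketch a single pattern lookup for `Alice` returns the table row and the documents together, which is the kind of combined access the summary describes. A real system would replace the substring matching with proper entity recognition and use an RDF store with a full query engine.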
ISBN: 9781467393713, 1467393711
DOI: 10.1109/WISA.2015.36