A Combined Index for Mixed Structured and Unstructured Data

Bibliographic Details
Published in	2015 12th Web Information System and Application Conference (WISA), pp. 217-222
Main Authors	Chunying Zhu, Qingzhong Li, Lanju Kong, Song Wei
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2015
Subjects	Aspirin; Diabetes; Indexes; Ontologies; Resource description framework; semantic index; Semantics; structured data; unstructured data
Summary:	In the big data era, one of the major challenges is the large volume of mixed structured and unstructured data, which comes from heterogeneous sources. Because they differ in form, structured and unstructured data are often treated separately. However, they may describe the same real-world entities. If a query involves both structured data and its unstructured counterpart, executing it separately against each is inefficient. The paper presents a novel index structure tailored to combinations of structured and unstructured data. The combined index is a joint index over structured databases and unstructured documents, based on entity co-occurrences. It is also a semantic index that describes the semantic relationships between entities and their multiple resources. The index is stored as RDF graphs, and queries are SPARQL-like. Experiments show that the associated index not only provides relevant information but also executes queries efficiently.
ISBN:	9781467393713, 1467393711
DOI:	10.1109/WISA.2015.36
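
Illustrative sketch (not taken from the paper): the Summary describes a joint index over database records and documents, built from entity co-occurrences, stored as RDF-style triples and queried with SPARQL-like patterns. The Python below is one minimal way to picture that idea. The predicate names (ex:appearsInRecord, ex:mentionedInDocument, ex:coOccursWith), the function names, the substring-based entity matching, and the sample data are all assumptions made for illustration; the paper's actual vocabulary and algorithms are not given in this record.

```python
# Hypothetical predicate names; the paper's real RDF vocabulary is not stated here.
IN_RECORD = "ex:appearsInRecord"
IN_DOCUMENT = "ex:mentionedInDocument"
CO_OCCURS = "ex:coOccursWith"


def build_combined_index(rows, documents):
    """Build a set of (subject, predicate, object) triples.

    rows:      {record_id: [entity, ...]}  (structured side)
    documents: {doc_id: text}              (unstructured side)
    """
    triples = set()
    entities = {e for ents in rows.values() for e in ents}

    # Link each entity to the structured records it appears in.
    for record_id, ents in rows.items():
        for entity in ents:
            triples.add((entity, IN_RECORD, record_id))

    # Link entities to documents that mention them (naive substring match,
    # an assumption standing in for real entity recognition).
    for doc_id, text in documents.items():
        lowered = text.lower()
        found = sorted(e for e in entities if e.lower() in lowered)
        for entity in found:
            triples.add((entity, IN_DOCUMENT, doc_id))
        # Entities mentioned in the same document co-occur.
        for i, a in enumerate(found):
            for b in found[i + 1:]:
                triples.add((a, CO_OCCURS, b))
                triples.add((b, CO_OCCURS, a))
    return triples


def match(triples, s=None, p=None, o=None):
    """Triple-pattern lookup; None plays the role of a SPARQL variable."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Made-up sample data, for illustration only.
rows = {"patient:1": ["Aspirin", "Diabetes"], "patient:2": ["Aspirin"]}
documents = {
    "doc:1": "Low-dose aspirin is often discussed for patients with diabetes.",
    "doc:2": "A short note on diabetes and diet.",
}

index = build_combined_index(rows, documents)

# One lookup returns both structured and unstructured resources for an entity.
for triple in match(index, s="Aspirin"):
    print(triple)
```

Under these assumptions, a single pattern such as match(index, s="Aspirin") reaches records and documents together, which is the kind of query the Summary says is inefficient to run separately over each source.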