Enabling Joins over Cassandra NoSQL Databases
Published in | Big Data Innovations and Applications Vol. 1054; pp. 3 - 17 |
---|---|
Main Authors | , , , , |
Format | Book Chapter |
Language | English |
Published | Switzerland, Springer International Publishing AG, 2019, Springer International Publishing |
Series | Communications in Computer and Information Science |
Summary: | Over the last few years, we have witnessed an explosion in the development of data management solutions for big data applications. In this direction, NoSQL databases provide new opportunities by enabling elastic scaling, fault tolerance, high availability and schema flexibility. Despite these benefits, their limitations in the flexibility of query mechanisms impose a real barrier for any application whose access use cases are not predetermined. One of the main reasons for this bottleneck is that NoSQL databases do not directly support joins. In this paper, we propose a data management solution, designed initially for eHealth environments, that relies on NoSQL Cassandra databases and efficiently supports joins while requiring no set-up time. More specifically, we present a query optimization and execution module that can be placed, at runtime, on top of any Cassandra cluster, efficiently combining information from different column-families. Our optimizer rewrites input queries into queries for individual column-families and considers two join algorithms implemented for the efficient execution of the requested joins. Our evaluation demonstrates the feasibility of our solution and the advantages gained compared to the only solution currently available, provided by DataStax. To the best of our knowledge, our approach is the first and only available open source solution allowing joins over NoSQL Cassandra databases. |
---|---|
ISBN: | 9783030273545 3030273547 |
ISSN: | 1865-0929 1865-0937 |
DOI: | 10.1007/978-3-030-27355-2_1 |