Lake Data Warehouse Architecture for Big Data Solutions

Bibliographic Details
Published in: International Journal of Advanced Computer Science & Applications, Vol. 11, No. 8
Main Authors: Saddad, Emad; El-Bastawissy, Ali; M., Hoda; Hazman, Maryam
Format: Journal Article
Language: English
Published: West Yorkshire: Science and Information (SAI) Organization Limited, 2020

Summary: The traditional Data Warehouse is a multidimensional repository of nonvolatile, subject-oriented, integrated, time-variant, and non-operational data gathered from multiple heterogeneous data sources. Its architecture must be adapted to deal with the new challenges imposed by the abundance of data and by current big data characteristics, including volume, value, variety, validity, volatility, visualization, variability, and venue. The new architecture also needs to address existing drawbacks, including availability, scalability, and, consequently, query performance. This paper introduces a novel Data Warehouse architecture, named Lake Data Warehouse Architecture, that gives the traditional Data Warehouse the capabilities to overcome these challenges. Lake Data Warehouse Architecture merges the traditional Data Warehouse architecture with big data technologies, such as the Hadoop framework and Apache Spark, to provide a hybrid solution in a complementary way. The main advantage of the proposed architecture is that it integrates the existing features of traditional Data Warehouses with big data features acquired by integrating the traditional Data Warehouse with the Hadoop and Spark ecosystems. Furthermore, it is tailored to handle a tremendous volume of data while maintaining availability, reliability, and scalability.
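
The following minimal PySpark sketch (not taken from the paper) illustrates the kind of hybrid access pattern the summary describes: Spark reads a dimension table from an existing relational Data Warehouse over JDBC and high-volume fact data from a Hadoop-based lake as Parquet, then joins them in a single query. All connection details, table names, columns, and paths are hypothetical assumptions for illustration only.

    # Hybrid Data Warehouse / data lake access sketch; all names and URLs are hypothetical.
    # Running it requires a Spark installation and the matching JDBC driver on the classpath.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("lake-dw-hybrid-sketch")
        .getOrCreate()
    )

    # Dimension data kept in the existing relational Data Warehouse (hypothetical connection).
    dim_customer = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://dw-host:5432/warehouse")
        .option("dbtable", "dim_customer")
        .option("user", "report_user")
        .option("password", "***")
        .load()
    )

    # Fact data landed in the Hadoop-based lake as Parquet (hypothetical path).
    fact_sales = spark.read.parquet("hdfs:///lake/raw/sales/")

    # A single Spark query spans both stores, reflecting the complementary,
    # hybrid behaviour the proposed architecture aims for.
    dim_customer.createOrReplaceTempView("dim_customer")
    fact_sales.createOrReplaceTempView("fact_sales")

    revenue_by_region = spark.sql("""
        SELECT c.region, SUM(f.amount) AS total_revenue
        FROM fact_sales f
        JOIN dim_customer c ON f.customer_id = c.customer_id
        GROUP BY c.region
    """)
    revenue_by_region.show()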
ISSN: 2158-107X, 2156-5570
DOI: 10.14569/IJACSA.2020.0110854