Fuzzy-Folded Bloom Filter-as-a-Service for Big Data Storage in the Cloud

With the ongoing trend of smart and Internet-connected objects being deployed across a broad range of applications, there is also a corresponding increase in the amount of data movement across different geographical regions. This, in turn, poses a number of challenges with respect to big data storag...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on industrial informatics Vol. 15; no. 4; pp. 2338 - 2348
Main Authors Singh, Amritpal, Garg, Sahil, Kaur, Kuljeet, Batra, Shalini, Kumar, Neeraj, Choo, Kim-Kwang Raymond
Format Journal Article
LanguageEnglish
Published Piscataway IEEE 01.04.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:With the ongoing trend of smart and Internet-connected objects being deployed across a broad range of applications, there is also a corresponding increase in the amount of data movement across different geographical regions. This, in turn, poses a number of challenges with respect to big data storage across multiple locations, including cloud computing platform. For example, the underlying distributed file system has a large number of directories and files in the form of gigantic trees, which are difficult to parse in polynomial time. Moreover, with the exponential increase of big data streams (i.e., unbounded sets of continuous data flows), challenges associated with indexing and membership queries are compounded. The capability to process such significant amount of data with high accuracy can have significant impact on decision-making and formulation of business and risk-related strategies, particularly in our current Industrial Internet of Things environment (IIoT). However, existing storage solutions are deterministic in nature. In other words, they tend to consume considerable memory and CPU time to yield accurate results. This necessitates the design of efficient quality of service-aware IIoT applications that are able to deal with the challenges of data storage and retrieval in the cloud computing environment. In this paper, we present an effective space-effective strategy for massive data storage using bloom filter (BF). Specifically, in the proposed scheme, the standard BF is extended to incorporate fuzzy-enabled folding approach, hereafter referred to as fuzzy folded BF (FFBF). In FFBF, fuzzy operations are used to accommodate the hashed data of one BF into another to reduce storage requirements. Evaluations on UCI ML AReM and Facebook datasets demonstrate the efficacy of FFBF, in terms of dealing with approximately 1.9 times more data as compared to using the standard BF. This is also achieved without affecting the false positive rate and query time.
ISSN:1551-3203
1941-0050
DOI:10.1109/TII.2018.2850053