Data stream treatment using sliding windows with MapReduce

Bibliographic Details
Published in: Journal of Computer Science & Technology, Vol. 16, no. 2, pp. 76-83
Main Authors: Basgall, Maria Jose; Hasperue, Waldo; Naiouf, Marcelo
Format: Journal Article
Language: English, Spanish
Published: La Plata, Graduate Network of Argentine Universities with Computer Science Schools (RedUNCI), 01.11.2016
Universidad Nacional de la Plata, Journal of Computer Science and Technology
Postgraduate Office, School of Computer Science, Universidad Nacional de La Plata
Summary: Knowledge Discovery in Databases (KDD) techniques are limited when the volume of data to process is very large, because any KDD algorithm needs several iterations over the complete data set to carry out its work. For continuous data stream processing, part of the stream must be stored in a temporal window. In this paper, we present a technique that adjusts the size of the temporal window dynamically, based on the frequency of data arrival and the response time of the KDD task. The results show that this technique reaches a large window size, in which each example of the stream is used in more than one iteration of the KDD task.
Keywords: Big Data, MapReduce, Stream Processing.
ISSN: 1666-6046, 1666-6038