Enhancing the DISSFCM Algorithm for Data Stream Classification

Bibliographic Details
Published in	Fuzzy Logic and Applications, Vol. 11291, pp. 109-122
Main Authors	Casalino, Gabriella; Castellano, Giovanna; Fanelli, Anna Maria; Mencar, Corrado
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2019
Series	Lecture Notes in Computer Science
Subjects	Data stream classification; Incremental adaptive clustering; Semi-supervised fuzzy clustering
ISBN	9783030125431, 3030125432
ISSN	0302-9743, 1611-3349
DOI	10.1007/978-3-030-12544-8_9

Summary:	Analyzing data streams has become a new challenge in meeting the demands of real-time analytics. Conventional mining techniques cope poorly with the challenges of data streams, including resource constraints such as memory and running time, and the need to process the data in a single scan. Most existing data stream classification methods require labeled samples, which are more difficult and expensive to obtain than unlabeled ones. Semi-supervised learning algorithms can address this problem by using unlabeled samples together with a few labeled ones to build classification models. Recently we proposed DISSFCM, an algorithm for data stream classification based on incremental semi-supervised fuzzy clustering. To cope with the evolution of data, DISSFCM dynamically adapts the number of clusters by splitting large-scale clusters. While splitting is effective in improving the quality of clusters, repeated application without a counterbalance may produce many small-scale clusters. To solve this problem, in this paper we enhance DISSFCM by introducing a procedure that merges small-scale clusters. Preliminary experimental results on a real-world benchmark dataset show the effectiveness of the method.