PLAR: Parallel Large-Scale Attribute Reduction on Cloud Systems

Bibliographic Details
Published in: 2013 International Conference on Parallel and Distributed Computing, Applications and Technologies, pp. 184-191
Main Authors: Junbo Zhang, Tianrui Li, Yi Pan
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2013
Summary: Attribute reduction for big data is viewed as an important preprocessing step in pattern recognition, machine learning, and data mining. This paper proposes a novel parallel method, based on MapReduce, for large-scale attribute reduction. Using this method, several representative heuristic attribute reduction algorithms from rough set theory are parallelized. Each of the parallelized algorithms selects the same attribute reduct as its sequential version and therefore achieves the same classification accuracy. An extensive experimental evaluation shows that these parallel algorithms are effective for big data.
ISSN: 2379-5352
DOI: 10.1109/PDCAT.2013.36
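
Illustrative sketch (not taken from the paper): the summary says that rough-set attribute reduction heuristics were parallelized with MapReduce and that each parallel version selects the same reduct as its sequential counterpart. The Python below is a minimal, single-process imitation of that idea, under several assumptions: it uses a positive-region heuristic as a stand-in for the unspecified algorithms, it simulates map and reduce with plain functions rather than a real MapReduce framework, and every name in it (`map_partition`, `merge_counts`, `positive_region_size`, `greedy_reduct`, the toy `rows`) is hypothetical.

```python
from collections import defaultdict
from functools import reduce


def map_partition(rows, attrs):
    """Map step (assumed): for one data split, count decision labels
    per equivalence-class key induced by the attributes in `attrs`."""
    counts = defaultdict(lambda: defaultdict(int))
    for *conds, decision in rows:
        key = tuple(conds[a] for a in attrs)
        counts[key][decision] += 1
    return counts


def merge_counts(left, right):
    """Reduce step (assumed): merge per-split label counts by key."""
    for key, labels in right.items():
        for label, n in labels.items():
            left[key][label] += n
    return left


def positive_region_size(splits, attrs):
    """Number of rows whose equivalence class has a single decision label."""
    partials = [map_partition(split, attrs) for split in splits]
    merged = reduce(merge_counts, partials,
                    defaultdict(lambda: defaultdict(int)))
    return sum(sum(labels.values())
               for labels in merged.values() if len(labels) == 1)


def greedy_reduct(splits, n_attrs):
    """Forward selection: add the attribute that enlarges the positive
    region most, until it matches that of the full attribute set."""
    all_attrs = list(range(n_attrs))
    target = positive_region_size(splits, all_attrs)
    selected = []
    while positive_region_size(splits, selected) < target:
        remaining = [a for a in all_attrs if a not in selected]
        best = max(remaining,
                   key=lambda a: positive_region_size(splits, selected + [a]))
        selected.append(best)
    return selected


# Toy decision table: three condition attributes, then the decision label.
rows = [
    (0, 0, 1, "n"), (0, 1, 1, "y"), (1, 0, 0, "n"),
    (1, 1, 0, "y"), (0, 1, 0, "y"), (1, 0, 1, "n"),
]

sequential = greedy_reduct([rows], 3)            # one split
parallel = greedy_reduct([rows[:2], rows[2:]], 3)  # two splits
```

Because the merged counts are identical however the rows are split, `sequential` and `parallel` hold the same attribute list in this sketch, which mirrors the equivalence property stated in the summary.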