The Parallel Improved Apriori Algorithm Research Based on Spark
Published in | International Conference on Frontier of Computer Science and Technology (Print), pp. 354-359 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.08.2015 |
ISSN | 2159-6301 |
DOI | 10.1109/FCST.2015.28 |
Summary: | The Apriori algorithm is one of the classical algorithms in the field of association rule mining. This paper analyzes the shortcomings of the classical Apriori algorithm, then improves it by constructing a new data structure and optimizing the pre-pruning step. Building on the improved Apriori algorithm and on Spark's support for fine-grained data processing, we elaborate how the improved algorithm can be parallelized and propose the SIAP algorithm. Experiments comparing SIAP with Apriori implementations based on Hadoop and on Spark show that SIAP is more efficient. |
---|---|