The Parallel Improved Apriori Algorithm Research Based on Spark
The Apriori algorithm is one of the classical algorithms in the association rule mining field. This paper analyzes the shortcomings of the classical Apriori algorithm, then improves it by constructing a new data structure and optimizing the pre-pruning step. Based on the improved Apriori algorithm and combine...
Published in | International Conference on Frontier of Computer Science and Technology (Print), pp. 354-359 |
---|---|
Main Authors | Yang, Shaosong; Xu, Guoyan; Wang, Zhijian; Zhou, Fachao |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.08.2015 |
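Subjects | Algorithm design and analysis; Apriori; association rule; Clustering algorithms; Data mining; efficiency; Heuristic algorithms; Itemsets; parallel; Spark; Sparks |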
ISSN | 2159-6301 |
DOI | 10.1109/FCST.2015.28 |
Abstract | The Apriori algorithm is one of the classical algorithms in the association rule mining field. This paper analyzes the shortcomings of the classical Apriori algorithm, then improves it by constructing a new data structure and optimizing the pre-pruning step. Building on the improved algorithm and on Spark's support for fine-grained data processing, we describe how the improved Apriori algorithm can be processed in parallel and propose the SIAP algorithm. We compared SIAP experimentally with Apriori implementations based on Hadoop and on Spark, and the results show that SIAP is more efficient. |
---|---|
Author | Zhou, Fachao; Wang, Zhijian; Xu, Guoyan; Yang, Shaosong |
Author_xml | 1. Yang, Shaosong (489271346@qq.com), Coll. of Comput. & Inf., Hohai Univ., Nanjing, China; 2. Xu, Guoyan (gy_xu@126.com), Coll. of Comput. & Inf., Hohai Univ., Nanjing, China; 3. Wang, Zhijian (zhjwang@hhu.edu.cn), Coll. of Comput. & Inf., Hohai Univ., Nanjing, China; 4. Zhou, Fachao (790428547@qq.com), Coll. of Comput. & Inf., Hohai Univ., Nanjing, China |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
EISBN | 9781467392952; 1467392952; 1467392944; 9781467392945 |
EndPage | 359 |
ExternalDocumentID | 7314705 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
PageCount | 6 |
PublicationDate | 20150801 |
PublicationTitle | International Conference on Frontier of Computer Science and Technology (Print) |
PublicationTitleAbbrev | FCST |
PublicationYear | 2015 |
Publisher | IEEE |
StartPage | 354 |
SubjectTerms | Algorithm design and analysis; Apriori; association rule; Clustering algorithms; Data mining; efficiency; Heuristic algorithms; Itemsets; parallel; Spark; Sparks |
Title | The Parallel Improved Apriori Algorithm Research Based on Spark |
URI | https://ieeexplore.ieee.org/document/7314705 |
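
Note on the abstract: the record refers to the classical Apriori algorithm and its pruning step but does not describe the new data structure or the SIAP algorithm itself. For orientation only, the following is a minimal single-machine sketch of classical Apriori in plain Python. It is not the authors' method and does not use Spark or Hadoop; the function name `apriori`, the parameter `min_support` (an absolute count), and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one scan over all transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Made-up sample data, not taken from the paper.
demo = apriori(
    [
        {"bread", "milk"},
        {"bread", "diapers", "beer"},
        {"milk", "diapers", "beer"},
        {"bread", "milk", "diapers"},
        {"bread", "milk", "beer"},
    ],
    min_support=3,
)
```

The count step rescans every transaction once per level, which is the kind of repeated work that parallel variants on Hadoop or Spark typically distribute across workers; how SIAP does so is not stated in this record.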