Top-k high utility pattern mining algorithm based on MapReduce

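High utility pattern mining is widely applied in data mining. To mine a specified number of high utility patterns, several top-k high utility mining algorithms based on tree structures and utility-list structures have been proposed; however, the former generate a large number of candidate patterns during mining, and the latter require many comparisons as utility patterns grow. Moreover, because data volumes are growing explosively in the information society, mining high utility patterns from very large datasets comes at the cost of substantial storage space and computation. To address these two problems, a top-k high utility pattern mining algorithm based on MapReduce (TKHUP_MaR) is proposed. The algorithm scans the database twice and uses three MapReduce phases to mine top-k high utility patterns in parallel. Experiments show that TKHUP_MaR is effective for mining top-k high utility patterns in parallel.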
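
To make the task in the abstract concrete, the following is a minimal sketch of what "top-k high utility patterns" means, written in a map/reduce style. It is not TKHUP_MaR: the record gives no details of the two database scans or the three MapReduce phases, so this sketch uses a single map step and a single reduce step with brute-force enumeration, which is exponential in transaction length and only suitable for toy data. The transactions, the unit profits, and all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations
import heapq

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1}


def map_phase(transaction):
    """Emit (itemset, utility) for every non-empty itemset in one transaction.

    The utility of an itemset in a transaction is the sum of
    quantity * unit profit over the items of the itemset.
    """
    items = sorted(transaction)
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            yield itemset, sum(transaction[i] * PROFIT[i] for i in itemset)


def reduce_phase(pairs):
    """Sum the emitted utilities per itemset across all transactions."""
    totals = defaultdict(int)
    for itemset, utility in pairs:
        totals[itemset] += utility
    return totals


def top_k_high_utility(transactions, k):
    """Return the k itemsets with the highest total utility, best first."""
    pairs = (pair for t in transactions for pair in map_phase(t))
    totals = reduce_phase(pairs)
    # Ties on utility are broken by the itemset tuple itself.
    return heapq.nlargest(k, totals.items(), key=lambda kv: (kv[1], kv[0]))
```

On this toy data, `top_k_high_utility(TRANSACTIONS, 3)` ranks `("a",)` first with utility 15, then `("a", "c")` with 11 and `("a", "b")` with 9. Asking for the k best patterns removes the need to choose a minimum utility threshold in advance, which is the motivation for the top-k formulation.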

Bibliographic Details
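Published in 计算机应用研究 (Application Research of Computers), Vol. 34, no. 10, pp. 2897-2900
Main Authors Wu Qian (吴倩), Wang Linping (王林平), Luo Xiangzhou (罗相洲), Cui Jianqun (崔建群), Wang Hai (王海)
Format Journal Article
Language Chinese
Published School of Computer Science, Central China Normal University, Wuhan 430079; Office of Science and Technology, Central China Normal University, Wuhan 430079, 2017
ISSN 1001-3695
DOI 10.3969/j.issn.1001-3695.2017.10.004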

Abstract_FL High utility pattern mining has been widely applied in the field of data mining. Several top-k high utility pattern mining algorithms based on tree-like and list-like structures have been proposed. However, tree-based algorithms generate a large number of candidates, and list-based algorithms incur costly comparison operations during utility pattern growth. In addition, the amount of data is growing explosively in the information society, so mining requires substantial memory and computation, especially when the dataset is huge. To address these issues, this paper proposes a top-k high utility pattern mining algorithm based on MapReduce, called TKHUP_MaR. TKHUP_MaR scans the database twice and uses three MapReduce phases to parallelize top-k high utility pattern mining. Experimental results show that TKHUP_MaR is effective at mining top-k high utility patterns in a parallel environment.
Author_FL Wu Qian
Wang Linping
Luo Xiangzhou
Cui Jianqun
Wang Hai
ClassificationCodes TP301.6
ContentType Journal Article
Copyright Copyright © Wanfang Data Co. Ltd. All Rights Reserved.
DatabaseName 中文期刊服务平台
中文科技期刊数据库-CALIS站点
维普中文期刊数据库
中文科技期刊数据库-工程技术
中文科技期刊数据库- 镜像站点
Wanfang Data Journals - Hong Kong
WANFANG Data Centre
Wanfang Data Journals
万方数据期刊 - 香港版
China Online Journals (COJ)
Discipline Computer Science
DocumentTitleAlternate Top-k high utility pattern mining algorithm based on MapReduce
EndPage 2900
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: (61370108)
IsPeerReviewed false
IsScholarly true
Issue 10
Keywords high utility pattern
top-k
data mining
parallel algorithm
MapReduce
Notes 51-1196/TP
PageCount 4
PublicationCentury 2000
PublicationDate 2017
PublicationDateYYYYMMDD 2017-01-01
PublicationDecade 2010
PublicationTitle 计算机应用研究
PublicationTitleAlternate Application Research of Computers
PublicationYear 2017
StartPage 2897
SubjectTerms MapReduce
top-k
parallel algorithm
data mining
high utility pattern
Title Top-k high utility pattern mining algorithm based on MapReduce
URI http://lib.cqvip.com/qk/93231X/201710/673142315.html
https://d.wanfangdata.com.cn/periodical/jsjyyyj201710004
Volume 34