A comparative study of Reduced Error Pruning method in decision tree algorithms

Bibliographic Details
Published in 2012 IEEE International Conference on Control System, Computing and Engineering, pp. 392-397
Main Authors Mohamed, W. Nor Haizan W.; Salleh, Mohd Najib Mohd; Omar, Abdul Halim
Format Conference Proceeding
Language English
Published IEEE 01.11.2012
ISBN 9781467331425
1467331422
DOI 10.1109/ICCSCE.2012.6487177

Abstract Decision trees are one of the most popular and efficient techniques in data mining, and they have been established and well explored by many researchers. However, some decision tree algorithms may produce a large tree structure that is difficult to understand. Furthermore, misclassification of data often occurs in the learning process. Therefore, a decision tree algorithm that can produce a simple tree structure with high classification accuracy is needed to work with huge volumes of data. Pruning methods have been introduced to reduce the complexity of the tree structure without decreasing classification accuracy. One such pruning method is Reduced Error Pruning (REP). To better understand pruning methods, an experiment was conducted using the Weka application to compare performance, in terms of tree-structure complexity and classification accuracy, for the J48, REPTree, PART, JRip, and Ridor algorithms on seven standard datasets from the UCI machine learning repository. In data modeling, J48 and REPTree generate a tree structure as output, while PART, Ridor, and JRip generate rules. In addition, J48, REPTree, and PART use the REP method for pruning, while Ridor and JRip use improvements of the REP method, namely IREP and RIPPER. The experimental results show that J48 and REPTree are competitive in producing better results. Between J48 and REPTree, the average difference is 7.1006% in classification accuracy and 6.2857% in tree-structure complexity. Among the rule-based classification algorithms, Ridor is the best compared with PART and JRip because it achieves the highest classification accuracy on five of the seven datasets. An algorithm that produces high accuracy with a simple tree structure or simple rules can be regarded as the best decision tree algorithm.
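Note on REP: the abstract names Reduced Error Pruning without describing its mechanics. The following is a minimal sketch of the textbook form of the technique, assuming a held-out validation set and a tree whose nodes store their majority training class. It is not the Weka implementation used in the paper; the `Node` class, the helper names, the toy data, and the tie rule (prune when the leaf is no worse) are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Example = Tuple[Dict[str, str], str]  # (feature values, class label)


@dataclass
class Node:
    majority: str                          # majority training class at this node
    feature: Optional[str] = None          # None marks a leaf
    children: Optional[Dict[str, "Node"]] = None

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, x: Dict[str, str]) -> str:
    """Walk down the tree; fall back to the majority class on unseen values."""
    while not node.is_leaf():
        child = node.children.get(x.get(node.feature))
        if child is None:
            return node.majority
        node = child
    return node.majority


def errors(node: Node, data: List[Example]) -> int:
    """Number of examples in `data` that the (sub)tree misclassifies."""
    return sum(predict(node, x) != y for x, y in data)


def count_nodes(node: Node) -> int:
    if node.is_leaf():
        return 1
    return 1 + sum(count_nodes(c) for c in node.children.values())


def reduced_error_prune(node: Node, validation: List[Example]) -> Node:
    """Bottom-up REP: replace a subtree with a leaf when that does not
    increase the error on the validation examples reaching it."""
    if node.is_leaf():
        return node
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in validation if x.get(node.feature) == value]
        node.children[value] = reduced_error_prune(child, subset)
    leaf = Node(majority=node.majority)
    # Assumed tie rule: prefer the smaller tree when errors are equal.
    if errors(leaf, validation) <= errors(node, validation):
        return leaf
    return node


# Toy tree (hypothetical): the "windy" split under "sunny" is an overfit.
tree = Node("yes", "outlook", {
    "sunny": Node("yes", "windy", {"yes": Node("no"), "no": Node("yes")}),
    "rainy": Node("no"),
})
validation: List[Example] = [
    ({"outlook": "sunny", "windy": "yes"}, "yes"),
    ({"outlook": "sunny", "windy": "no"}, "yes"),
    ({"outlook": "rainy", "windy": "no"}, "no"),
    ({"outlook": "rainy", "windy": "yes"}, "no"),
]
before = (count_nodes(tree), errors(tree, validation))
pruned = reduced_error_prune(tree, validation)
after = (count_nodes(pruned), errors(pruned, validation))
print("nodes, validation errors before:", before)
print("nodes, validation errors after: ", after)
```

On this toy data the overfit subtree collapses into a single leaf while the root split survives, which mirrors the trade-off the study measures: a smaller structure with no loss of accuracy on the held-out examples.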
Author_xml – sequence: 1
  givenname: W. Nor Haizan W.
  surname: Mohamed
  fullname: Mohamed, W. Nor Haizan W.
  email: gi110016@siswa.uthm.edu.my
  organization: Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
– sequence: 2
  givenname: Mohd Najib Mohd
  surname: Salleh
  fullname: Salleh, Mohd Najib Mohd
  email: najib@uthm.edu.my
  organization: Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
– sequence: 3
  givenname: Abdul Halim
  surname: Omar
  fullname: Omar, Abdul Halim
  email: gi110017@siswa.uthm.edu.my
  organization: Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
EISBN 1467331414
1467331430
9781467331432
9781467331418
SubjectTerms Accuracy
Breast tissue
Classification algorithms
Complexity theory
Data mining
Data models
Decision tree
Decision trees
Iris
Machine learning algorithms
Prediction algorithms
Reduced Error Pruning
rules
URI https://ieeexplore.ieee.org/document/6487177