Implementation of LightGBM and Random Forest in Potential Customer Classification

Bibliographic Details
Published in TIERS Information Technology Journal Vol. 4; no. 1; pp. 43-55
Main Authors Sari, Laura; Romadloni, Annisa; Lityaningrum, Rostika; Hastuti, Hety Dwi
Format Journal Article
Language English
Published 25.06.2023
Summary: Classification is one of the data mining techniques that can be used to identify potential customers. Previous research shows that boosting methods, especially LightGBM (LGBM), produce the highest accuracy of all models compared, namely 100%. Of the two bagging methods, Random Forest produced higher accuracy than Extra Trees, namely 99.03%. This research uses the LightGBM and Random Forest methods to classify potential customers. The results indicate that on imbalanced data LightGBM has better accuracy than Random Forest, namely 85.49%, whereas Random Forest is unable to produce a model. The SMOTE method used in this study affects the accuracy of Random Forest but does not affect the accuracy of LightGBM. Across the accuracy, recall, specificity, and precision values, Random Forest produces good values compared to LightGBM on balanced data, while LightGBM is able to handle imbalanced data.
ISSN:2723-4533, 2723-4541
DOI:10.38043/tiers.v4i1.4355
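
The summary compares the two classifiers on accuracy, recall, specificity, and precision. As a rough illustration of how these four measures are usually derived from a binary confusion matrix, here is a minimal sketch in plain Python. The function name `binary_metrics`, the `BinaryMetrics` container, and the convention that label 1 marks a potential customer are assumptions made for this example; they are not taken from the article, and no data from the study is used.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    recall: float
    specificity: float
    precision: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a measure is undefined,
    # e.g. precision when the model never predicts the positive class.
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred, positive=1) -> BinaryMetrics:
    """Compute four standard measures from true and predicted labels.

    `positive` is the label treated as the positive class
    (assumed here to mean "potential customer").
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # positive correctly predicted
            else:
                fp += 1  # negative predicted as positive
        else:
            if truth == positive:
                fn += 1  # positive missed
            else:
                tn += 1  # negative correctly predicted

    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        recall=_ratio(tp, tp + fn),        # true positive rate
        specificity=_ratio(tn, tn + fp),   # true negative rate
        precision=_ratio(tp, tp + fp),
    )


# Made-up, imbalanced toy labels (2 positives, 8 negatives) and a
# classifier that always predicts the majority class.
toy_truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [0] * 10
toy_result = binary_metrics(toy_truth, toy_pred)
```

On the toy labels the always-negative predictor still scores a high accuracy while its recall is zero, which is why the summary reports recall, specificity, and precision alongside accuracy when the classes are imbalanced.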