Comparative analysis of classification algorithm evaluations to predict secondary school students’ achievement in core and elective subjects
Published in | International Journal of Advanced Technology and Engineering Exploration Vol. 9; no. 89; p. 430 |
---|---|
Format | Journal Article |
Language | English |
Published | Bhopal: Accent Social and Welfare Society, 30.04.2022 |
Summary: | Many researchers in educational data mining (EDM) have explored machine learning techniques for predicting students’ performance. However, a major challenge in classification modelling is selecting the algorithm that achieves the highest accuracy. This study used datasets from two premier Malaysian secondary schools, Maktab Rendah Sains Mara (MRSM) Kuala Berang and Kuala Terengganu. It addresses two key questions: first, which algorithm best predicts secondary students’ achievement in core and elective subjects; second, whether the same features and algorithms can predict academic performance from students’ first-semester achievement. To answer them, the study compared the effectiveness of six classification algorithms: naïve Bayes (NB), random forest (RF), k-nearest neighbour (kNN), support vector machine (SVM), sequential minimal optimization (SMO), and logistic regression (LGR). Each model’s prediction accuracy was evaluated using 10-fold cross-validation to identify the best model. The results showed that the RF model outperformed the other models in terms of accuracy, precision, recall, and F1-measure, with most algorithms achieving high accuracy levels on both the core and elective subjects datasets. The study concludes that prediction of secondary school students’ achievement can begin as early as the first semester, using RF on the core and elective subjects datasets with biology. The accuracy obtained was 96.7% for the core subjects and 97.5% for the elective subjects. |
---|---|
ISSN: | 2394-5443 2394-7454 |
DOI: | 10.19101/IJATEE.2021.875311 |
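The summary names 10-fold cross-validation and four metrics (accuracy, precision, recall, F1-measure) but gives no implementation details. The following is a minimal standard-library Python sketch of that evaluation procedure, not the authors' pipeline. All function names are hypothetical, the majority-class model is only a placeholder for the six classifiers, the metrics are computed for a single positive class, and pooling the held-out predictions before scoring (rather than averaging per-fold scores) is an assumption.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def majority_class_fit(X_train, y_train):
    """Placeholder model (assumption): always predicts the most common label."""
    label = Counter(y_train).most_common(1)[0][0]
    return lambda rows: [label] * len(rows)


def cross_validate(X, y, fit, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, score pooled predictions."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_pred.extend(predict([X[i] for i in held_out]))
        y_true.extend(y[i] for i in held_out)
    return binary_metrics(y_true, y_pred)


if __name__ == "__main__":
    # Toy data, invented for illustration only: 14 "pass" and 6 "fail" labels.
    X = [[i] for i in range(20)]
    y = [1] * 14 + [0] * 6
    print(cross_validate(X, y, majority_class_fit, k=10))
```

To compare several algorithms under this scheme, each one would be wrapped in a `fit` function with the same shape as `majority_class_fit` and passed to `cross_validate` with the same `seed`, so that every model is scored on identical folds.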