A comprehensive evaluation of multiclass imbalance techniques with ensemble models in IoT environments

Bibliographic Details
Published in Telkomnika Vol. 22; no. 3; pp. 690-701
Main Authors Amien, Januar Al, Ab Ghani, Hadhrami, Izrin Md Saleh, Nurul, Soni, Soni, Fatma, Yulia, Hayami, Regiolina
Format Journal Article
Language English
Published Yogyakarta Ahmad Dahlan University 01.06.2024
Summary: The Internet of Things (IoT) has revolutionized connectivity and has introduced significant security challenges. In this context, intrusion detection systems (IDS) play a crucial role in detecting attacks in IoT environments. Bot-IoT datasets often suffer from class imbalance, with the attack class having significantly more samples than the normal class. Addressing this imbalance is essential to enhance IDS performance. The study evaluates various techniques, including an imbalance ratio technique, which we call the imbalance ratio formula (IRF), for controlling imbalanced data, and compares IRF with oversampling techniques such as the synthetic minority oversampling technique (SMOTE) and adaptive synthetic sampling (ADASYN). This research also incorporates the extreme gradient boosting (XGBoost) ensemble model to improve IDS performance on multiclass imbalance in Bot-IoT datasets. Through in-depth analysis, we identify the strengths and weaknesses of each method. This study aims to guide researchers and practitioners working on IDS in high-risk IoT environments. The proposed IRF, when integrated with the XGBoost algorithm, achieves a comparable accuracy of 99.9993% while training on average at least two times faster than the other state-of-the-art ensemble methods.
ISSN: 1693-6930, 2302-9293
DOI:10.12928/telkomnika.v22i3.25887
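
The summary refers to an imbalance ratio for controlling imbalanced data but does not state the IRF itself. As a rough illustration only, the sketch below computes the commonly used per-class imbalance ratio (majority-class count divided by each class's count). This is an assumed, generic definition, not necessarily the formula proposed in the article; the function name `imbalance_ratios` and the toy labels are hypothetical.

```python
from collections import Counter


def imbalance_ratios(labels):
    """Return {class: majority_count / class_count} for every class.

    Assumption: this is the generic imbalance-ratio definition,
    not the article's IRF, which the summary does not spell out.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    majority = max(counts.values())
    return {cls: majority / n for cls, n in counts.items()}


# Hypothetical toy labels (illustrative only, not Bot-IoT figures).
toy_labels = ["attack_a"] * 900 + ["attack_b"] * 90 + ["normal"] * 10

ratios = imbalance_ratios(toy_labels)

# The overall imbalance ratio is the largest per-class ratio,
# i.e. majority count over the rarest class's count.
overall_ir = max(ratios.values())
```

A ratio near 1 indicates a class about as frequent as the majority class. Larger values flag the rare classes that resampling or class weighting would typically target.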