Handling class Imbalance problem in Intrusion Detection System based on deep learning
Published in | International Journal of Networking and Computing Vol. 12; no. 2; pp. 467 - 492 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | IJNC Editorial Committee, 2022 |
Subjects | |
Summary: | Network intrusion detection systems (NIDSs) are the most widely used tools for detecting malicious network activity. With the adoption of deep learning, NIDSs have in recent years achieved promising results in detecting both known and novel attacks. However, these NIDSs still have shortcomings. Most of the datasets used for NIDSs are highly imbalanced: samples of normal traffic far outnumber samples of attack traffic. This class imbalance skews the results. It biases the deep learning classifier toward the majority class and so limits its performance on minority classes. To improve the detection rate for minority classes while ensuring efficiency, this study proposes a hybrid approach to handling the imbalance problem. The approach combines oversampling with the Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links, an under-sampling method that reduces noise. Additionally, this study uses two deep learning models, a Long Short-Term Memory (LSTM) network and a Convolutional Neural Network (CNN), to provide a better intrusion detection system. The proposed model is tested on the NSL-KDD and CICIDS2017 datasets. In addition, we evaluate the method on the most recent intrusion detection dataset, CICIDS2018. We use 10-fold cross-validation to train the learning models and an independent test set for evaluation. The experimental results show that, in multi-class classification on the NSL-KDD dataset, the proposed model reached an overall accuracy and F-score of 99% and 99.0.2% respectively with LSTM, and 99.70% and 99.27% respectively with CNN. On CICIDS2017, it reached an overall accuracy and F-score of 99.65% and 98% respectively with LSTM, and 99.85% and 98.98% respectively with CNN. On CICIDS2018, the proposed method achieved an overall detection rate and F-score of 95% and 94% respectively. |
---|---|
ISSN: | 2185-2839 2185-2847 |
DOI: | 10.15803/ijnc.12.2_467 |
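
The hybrid resampling idea named in the summary (SMOTE oversampling followed by Tomek-link removal) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the function names `smote`, `remove_tomek_links`, and `smote_tomek` are hypothetical, the toy data is invented, only the two-class case is handled, and dropping both members of each Tomek link is an assumption about the cleaning step. In practice a library such as imbalanced-learn (its `SMOTETomek` class) would likely be used instead.

```python
import math
import random

def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def remove_tomek_links(X, y):
    """Drop every pair of mutual nearest neighbours with different labels.
    Assumption: both members of a link are removed."""
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))
    nn = [nearest(i) for i in range(len(X))]
    drop = {i for i, j in enumerate(nn) if nn[j] == i and y[i] != y[j]}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

def smote_tomek(X, y, minority_label, k=3, seed=0):
    """Binary case only: oversample the minority class up to the majority
    count, then clean the combined set with Tomek-link removal."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(0, (len(y) - len(minority)) - len(minority))
    new_points = smote(minority, n_new, k, random.Random(seed))
    return remove_tomek_links(list(X) + new_points,
                              list(y) + [minority_label] * n_new)

# Toy data (invented): 20 "normal" points on a grid, 3 distant "attack" points.
normal = [(a / 10, b / 10) for a in range(5) for b in range(4)]
attack = [(2.0, 2.0), (2.2, 2.1), (2.1, 2.3)]
X = normal + attack
y = ["normal"] * len(normal) + ["attack"] * len(attack)

X_res, y_res = smote_tomek(X, y, "attack")
print(y_res.count("normal"), y_res.count("attack"))
```

Resampling of this kind is normally applied to the training folds only, so that the independent test set keeps its original class distribution.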