Gradient Boosting Feature Selection With Machine Learning Classifiers for Intrusion Detection on Power Grids

Bibliographic Details
Published in: IEEE eTransactions on network and service management, Vol. 18, no. 1, pp. 1104-1116
Main Authors: Upadhyay, Darshana; Manero, Jaume; Zaman, Marzia; Sampalli, Srinivas
Format: Journal Article
Language: English
Published: New York, IEEE, 01.03.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Smart grids rely on SCADA (Supervisory Control and Data Acquisition) systems to monitor and control complex electrical networks in order to provide reliable energy to homes and industries. However, the increased inter-connectivity and remote accessibility of SCADA systems expose them to cyber attacks. As a consequence, developing effective security mechanisms is a priority in order to protect the network from internal and external attacks. We propose an integrated framework for an Intrusion Detection System (IDS) for smart grids which combines feature engineering-based preprocessing with machine learning classifiers. Whilst most of the machine learning techniques fine-tune the hyper-parameters to improve the detection rate, our approach focuses on selecting the most promising features of the dataset using Gradient Boosting Feature Selection (GBFS) before applying the classification algorithm, a combination which improves not only the detection rate but also the execution speed. GBFS uses the Weighted Feature Importance (WFI) extraction technique to reduce the complexity of classifiers. We implement and evaluate various decision-tree based machine learning techniques after obtaining the most promising features of the power grid dataset through a GBFS module, and show that this approach optimizes the False Positive Rate (FPR) and the execution time.
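Illustrative note: this record does not define GBFS or WFI in detail, so the following is only a minimal sketch of the general idea the summary describes, namely keeping the highest-scoring features before classification and then measuring the False Positive Rate. The function names, the `coverage` cutoff, the feature names, and the importance scores are all hypothetical and are not taken from the article; in practice the scores would come from a trained gradient boosting model.

```python
def select_by_importance(importances, coverage=0.9):
    """Keep the top-ranked features until their normalized importances
    together reach `coverage` of the total (hypothetical cutoff rule)."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= coverage:
            break
        selected.append(name)
        cumulative += score / total
    return selected


def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), assuming label 1 = attack and 0 = normal."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Made-up importance scores standing in for a gradient boosting model's output.
scores = {"f1": 0.5, "f2": 0.3, "f3": 0.15, "f4": 0.05}
kept = select_by_importance(scores, coverage=0.9)

# Made-up labels and predictions: three normal events, two attacks.
fpr = false_positive_rate([0, 0, 0, 1, 1], [0, 1, 0, 1, 0])
```

Training a classifier on only the `kept` columns is what would reduce model complexity and execution time in a scheme of this kind; whether it also lowers the FPR depends on the data and is something the article evaluates, not this sketch.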
ISSN: 1932-4537
DOI: 10.1109/TNSM.2020.3032618