Synthetic data generation for machine learning model training for energy theft scenarios using cosimulation

Bibliographic Details
Published in: IET generation, transmission & distribution Vol. 17; no. 5; pp. 1035-1046
Main Authors: Narayanan, Anantha; Hardy, Trevor
Format: Journal Article
Language: English
Published: Wiley, 01.03.2023
Summary: Technical and non-technical losses in distribution circuits result in significant economic costs to power utilities. One type of non-technical loss is energy theft by various means including illegal tapping of feeders, bypassing the meter, and billing fraud. These losses are usually hard to detect, and can remain undetected for long periods of time. Machine learning models have been proven effective in detecting these conditions, but rely on the availability of large, good-quality training data sets. The problem is exacerbated by the imbalanced nature of data related to these conditions: energy theft, though costly, is very rare. The available data sets generally have very few samples of theft with most of the data pertaining to normal operation. Such data sets are generally not suitable to train machine learning models. In this paper, an overview of energy theft detection techniques, the challenges with their data needs, and the limitations of current techniques to bridge such data limitations is presented. A co-simulation framework is proposed to generate reliable training data for machine learning algorithms for theft detection. An example scenario is presented and a machine learning model is built to detect certain kinds of energy theft.
ISSN: 1751-8687, 1751-8695
DOI: 10.1049/gtd2.12619