An Efficient Resampling Technique for Financial Statements Fraud Detection: A Comparative Study
Published in | 2023 3rd International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME) pp. 1 - 7 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 19.07.2023 |
Summary: | Financial statement fraud detection is the process of identifying falsified financial statements. Traditional auditing methods are time-consuming, expensive, and subject to error. Therefore, adopting an efficient and robust machine learning mechanism is important. Unfortunately, the current data sources suffer from a severe class imbalance. The lack of sufficient fraudulent financial statement records inspires the use of various resampling techniques. This paper a) examines the efficiency of different resampling strategies to detect fraudulent financial statements while employing multi-layer feedforward neural networks, support vector machines, and naïve Bayes machine learning models, and b) investigates the superiority of using Raw Accounting Variables (RAVs) over financial ratios for financial statement fraud detection. A benchmark dataset of numerical financial variables (RAVs and financial ratios) is used as features for model evaluation. The fraud labels correspond to the Accounting and Auditing Enforcement Releases by the U.S. Securities and Exchange Commission (SEC). We analyze the performance of the models on 28 RAVs and 14 financial ratios suggested by accounting experts. Using the area under the receiver operating characteristic curve (AUC) as the performance metric, the synthetic minority oversampling technique (SMOTE), along with a three-layer feedforward neural network (AUC: 0.863), greatly outperformed the RUSBoost (AUC: 0.717) model. |
DOI: | 10.1109/ICECCME57830.2023.10253185 |
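
The summary above describes the winning configuration as SMOTE oversampling paired with a three-layer feedforward neural network, scored by AUC against a RUSBoost baseline. The sketch below is a minimal, illustrative version of that kind of pipeline, not the authors' implementation: it assumes imbalanced-learn's SMOTE and scikit-learn's MLPClassifier, substitutes synthetic placeholder data for the benchmark raw accounting variables, and picks arbitrary layer sizes and hyperparameters.

```python
# Illustrative sketch: SMOTE oversampling + a feedforward network evaluated by AUC.
# Libraries, layer sizes, and the synthetic data are assumptions, not the paper's setup.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Placeholder for the 28 raw accounting variables; the real features would come
# from the benchmark dataset with fraud labels drawn from SEC AAER releases.
X = rng.normal(size=(2000, 28))
y = (rng.random(2000) < 0.02).astype(int)  # ~2% positives: severe class imbalance

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Oversample only the training split so the test set keeps its natural imbalance.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)

# "Three-layer" is read here as three hidden layers; the sizes are arbitrary.
clf = MLPClassifier(hidden_layer_sizes=(64, 32, 16), max_iter=500, random_state=0)
clf.fit(X_res, y_res)

auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"Test AUC: {auc:.3f}")
```

Resampling after the train/test split is the key design point: applying SMOTE before splitting would leak synthetic copies of test-set minority samples into training and inflate the reported AUC.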