Integrating Machine Learning Algorithms With Quantum Annealing Solvers for Online Fraud Detection

Bibliographic Details
Published in: IEEE Access, Vol. 10, pp. 75908-75917
Main Authors: Wang, Haibo; Wang, Wendy; Liu, Yi; Alidaee, Bahram
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022
Summary: Machine learning is increasingly applied to the identification of fraudulent transactions. However, most deployed systems detect fraudulent activity after it has already occurred, not at or near real time. Because fraudulent transactions are far fewer than normal ones, the highly imbalanced data makes fraud detection very challenging and calls for approaches beyond traditional machine learning. This study proposes a detection framework and implements it with a quantum machine learning (QML) approach, applying a Support Vector Machine (SVM) enhanced with quantum annealing solvers. To evaluate its detection performance, we also implemented twelve machine learning methods and compared the QML application against them on two datasets: Israel credit card transactions (non-time series), which is moderately imbalanced, and a bank loan dataset (time series), which is highly imbalanced. The results show that the quantum-enhanced SVM outperformed all the other methods in both speed and accuracy on the bank loan dataset, whereas its detection accuracy on the Israel credit card transactions data was similar to that of the others. For both datasets, feature selection significantly improved detection speed, although the improvement in accuracy was marginal. These findings demonstrate the potential of QML applications on time series based, highly imbalanced data, and the merit of traditional machine learning approaches on non-time series data. The study provides insight into selecting an appropriate approach for different types of datasets while taking into account the tradeoffs among speed, accuracy, and cost.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3190897