Generating and Validating Synthetic Training Data for Predicting Bankruptcy of Individual Businesses

Bibliographic Details
Published in: Journal of Information and Communication Convergence Engineering, Vol. 19, no. 4, pp. 228-233
Main Authors: Dong-Suk Hong, Cheol Baik
Format: Journal Article
Language: English
Published: Korea Institute of Information and Communication Engineering (한국정보통신학회), JICCE, 2021
Summary: In this study, we analyze the credit information (loans, delinquency records, etc.) of individual business owners and generate a large volume of training data for a bankruptcy prediction model using a partial synthetic training technique. Furthermore, we evaluate the prediction performance of the newly generated data against that of the actual data. In the experiments (a logistic regression task), training on data generated with conditional tabular generative adversarial networks (CTGAN) improves recall by 1.75 times compared with training on the actual data. The probability that the actual and generated data are sampled from an identical distribution is verified to be much higher than 80%. Providing artificial intelligence training data through data synthesis for credit rating and default risk prediction of individual businesses, fields in which research has been relatively inactive, promotes further in-depth research efforts that use such methods.
KCI Citation Count: 0
Bibliography: http://jiice.org
ISSN: 2234-8255, 2234-8883
DOI: 10.6109/jicce.2021.19.4.228
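
The summary reports its headline result as a ratio of recall values. As a reading aid, the following minimal sketch shows how recall and such a ratio are computed. The labels and predictions are hypothetical toy values chosen only to illustrate what a 1.75-times ratio means; they are not data or results from the article, and the function name `recall` is an assumption of this sketch.

```python
def recall(y_true, y_pred):
    """Recall for the positive class: TP / (TP + FN), with 1 = bankrupt."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


# Hypothetical toy test set: 7 bankrupt (1) and 3 non-bankrupt (0) businesses.
y_true = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

# Hypothetical predictions from a model trained on actual data
# (finds 4 of the 7 positives) and from a model trained on
# synthetic data (finds all 7 positives).
pred_actual_trained = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
pred_synthetic_trained = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

recall_actual = recall(y_true, pred_actual_trained)
recall_synthetic = recall(y_true, pred_synthetic_trained)

# Ratio of the two recall values, the form in which the summary
# states its improvement.
ratio = recall_synthetic / recall_actual
print(f"recall (actual-trained):    {recall_actual:.3f}")
print(f"recall (synthetic-trained): {recall_synthetic:.3f}")
print(f"improvement ratio:          {ratio:.2f}")
```

Recall only counts how many true bankruptcies are caught, so a higher recall can coexist with more false alarms, as in the toy predictions above.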