A comparison of random forest and logistic regression model in credit scoring of rural households

Bibliographic Details
Published in: IDEAS Working Paper Series from RePEc
Main Authors: Hong Nhung Do; Michel Simioni
Format: Paper
Language: English
Published: St. Louis: Federal Reserve Bank of St. Louis, 01.01.2021
Summary: Many banks currently use logistic regression for credit scoring when deciding whether to lend to customers. This paper compares random forest and logistic regression as predictive tools for credit scoring in support of financial analysis. We use data from the Vietnam Access to Resources Household Survey (VARHS), which covers 3,530 households in 12 provinces of Vietnam in 2014. Results show that random forest is a more accurate predictive tool than logistic regression. This suggests that banks could use random forest to identify potential borrowers from their existing client data, saving the time and cost of finding potential clients.
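
The comparison described in the summary can be illustrated with a minimal sketch, assuming scikit-learn is available. This is not the authors' code or data: the dataset below is synthetic (only the sample size of 3,530 echoes the summary), and the function name `compare_models`, the feature counts, the 70/30 split, and the hyperparameters are all hypothetical choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(n_households=3530, seed=0):
    """Fit both classifiers on the same split and report held-out metrics."""
    # ASSUMPTION: synthetic stand-in for household features and a binary
    # credit outcome; the real VARHS variables are not reproduced here.
    X, y = make_classification(
        n_samples=n_households,
        n_features=12,
        n_informative=6,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    models = {
        # Scaling helps the logistic regression solver converge.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=seed
        ),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        # Probability of the positive class, used for the AUC.
        scores = model.predict_proba(X_test)[:, 1]
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "auc": roc_auc_score(y_test, scores),
        }
    return results


if __name__ == "__main__":
    for name, metrics in compare_models().items():
        print(f"{name}: accuracy={metrics['accuracy']:.3f}, auc={metrics['auc']:.3f}")
```

Both models are evaluated on the same held-out households so that the metrics are directly comparable. Which model scores higher on synthetic data says nothing about the paper's finding; the sketch only shows the shape of such a comparison.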