A Novel Noise-Adapted Two-Layer Ensemble Model for Credit Scoring Based on Backflow Learning
Published in | IEEE Access, Vol. 7, pp. 99217-99230 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2019 |
Summary: | Machine learning methods and artificial intelligence algorithms have recently become increasingly important in classification problems such as credit scoring. Building an ensemble learning model, which has been shown to be typically more accurate and robust than individual classifiers, is an important information management task for commercial banks and loan lenders. In this paper, a novel noise-adapted two-layer ensemble model for credit scoring based on backflow learning is proposed, in which five widely used base classifiers, i.e., extreme gradient boosting, gradient boosting decision tree, support vector machine, random forest, and linear discriminant analysis, are integrated. To amplify the strength and diversity of the base classifiers, a new backflow learning approach is proposed so that the base classifiers relearn the misclassified data points. A final predictive result is obtained by fusing the predictions of all base classifiers through two-layer ensemble modeling. In addition, considering that noisy data are a major problem that degrades the accuracy of a predictive model, a new noise adaption approach based on the isolation forest algorithm is proposed to address noisy data. It first calculates the outlier score of each data point to detect the noisy data, which are subsequently boosted in the training set to form the noise-adapted training set. Three credit datasets from the UCI machine learning repository are used to compare the performance of the proposed model with that of other benchmark models. The experimental results show that the proposed model outperforms the other models, with satisfactory improvement in various performance measures. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2019.2930332 |
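
The pipeline outlined in the summary (isolation-forest noise adaption, backflow relearning, two-layer fusion) can be illustrated with a rough scikit-learn sketch. This is not the authors' implementation: the summary does not say how noisy points are "boosted" or how misclassified points are relearned, so the sketch assumes both are done by duplicating the affected rows. The function names, the 5% outlier fraction, the logistic-regression second layer, and the synthetic data are all hypothetical choices made for illustration. Extreme gradient boosting and the support vector machine are left out to keep the example short.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (GradientBoostingClassifier, IsolationForest,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split


def noise_adapted_set(X, y, fraction=0.05, copies=1, seed=0):
    """Score every point with an isolation forest and duplicate the top outliers.

    Assumption: "boosting" noisy points is modelled as appending extra copies.
    """
    forest = IsolationForest(random_state=seed).fit(X)
    scores = -forest.score_samples(X)  # higher value = more anomalous
    flagged = scores >= np.quantile(scores, 1.0 - fraction)
    X_aug = np.vstack([X] + [X[flagged]] * copies)
    y_aug = np.concatenate([y] + [y[flagged]] * copies)
    return X_aug, y_aug, flagged


def backflow_fit(model, X, y, cv=5):
    """Refit a base model with its out-of-fold misclassified points duplicated.

    Assumption: "relearning" is modelled as one round of row duplication.
    """
    wrong = cross_val_predict(model, X, y, cv=cv) != y
    return model.fit(np.vstack([X, X[wrong]]), np.concatenate([y, y[wrong]]))


def fit_two_layer(X, y, seed=0):
    """Layer 1: base classifiers. Layer 2: a meta-learner on their probabilities."""
    bases = [
        GradientBoostingClassifier(n_estimators=50, random_state=seed),
        RandomForestClassifier(n_estimators=50, random_state=seed),
        LinearDiscriminantAnalysis(),
    ]
    # Out-of-fold positive-class probabilities avoid leaking labels to layer 2.
    meta_X = np.column_stack([
        cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
        for m in bases
    ])
    meta = LogisticRegression().fit(meta_X, y)
    fitted = [backflow_fit(m, X, y) for m in bases]
    return fitted, meta


def predict_two_layer(fitted, meta, X):
    """Fuse base-model probabilities through the second-layer model."""
    meta_X = np.column_stack([m.predict_proba(X)[:, 1] for m in fitted])
    return meta.predict(meta_X)


# Synthetic stand-in for a credit dataset, with 5% of labels flipped as noise.
X, y = make_classification(n_samples=600, n_features=10, flip_y=0.05,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)
X_na, y_na, flagged = noise_adapted_set(X_tr, y_tr)
fitted, meta = fit_two_layer(X_na, y_na)
pred = predict_two_layer(fitted, meta, X_te)
accuracy = float((pred == y_te).mean())
print(f"flagged {int(flagged.sum())} training points; test accuracy {accuracy:.3f}")
```

The second layer here is trained on out-of-fold probabilities so that it never sees predictions made on data the base models were fitted on. Whether the paper fuses predictions this way is not stated in the summary.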