Risk analysis and retrospective unbalanced data

Bibliographic Details
Published in Revstat Vol. 14; no. 2; p. 157
Main Authors Pierri, Francesca; Stanghellini, Elena; Bistoni, Nicolo
Format Journal Article
Language English
Published Instituto Nacional de Estatistica 01.04.2016
Instituto Nacional de Estatística | Statistics Portugal
ISSN 1645-6726
2183-0371
DOI 10.57805/revstat.v14i2.184

Summary: This paper considers three different techniques applicable in the context of credit scoring when the event under study is rare and therefore we have to cope with unbalanced data. Logistic regression for matched case-control studies, logistic regression for a random balanced data sample, and logistic regression for a sample balanced by ROSE (Random OverSampling Examples; Lunardon, Menardi and Torelli, 2014) are tested. We applied the methods to real data: balance sheet indicators of small and medium-sized enterprises and their legal status are considered. The event of interest is the opening of insolvency proceedings of bankruptcy.
Key-Words: bankruptcy; case-control studies; data augmentation; logistic regression; ROSE method; unbalanced data.
AMS Subject Classification: 62J05, 62M20, 62P20, 91G40.