Exploratory Undersampling for Class-Imbalance Learning

Bibliographic Details
Published in: IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), Vol. 39, No. 2, pp. 539-550
Main Authors: Liu, Xu-Ying; Wu, Jianxin; Zhou, Zhi-Hua
Format: Journal Article
Language: English
Published: United States: IEEE, 01.04.2009

More Information
Summary: Undersampling is a popular method for dealing with class-imbalance problems: it uses only a subset of the majority class and is therefore very efficient. Its main deficiency is that many majority-class examples are ignored. We propose two algorithms to overcome this deficiency. EasyEnsemble samples several subsets from the majority class, trains a learner using each of them, and combines the outputs of those learners. BalanceCascade trains the learners sequentially, where in each step the majority-class examples that are correctly classified by the currently trained learners are removed from further consideration. Experimental results show that both methods achieve higher Area Under the ROC Curve (AUC), F-measure, and G-mean values than many existing class-imbalance learning methods. Moreover, they have approximately the same training time as undersampling when the same number of weak classifiers is used, which is significantly faster than other methods.
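
The ensemble strategy described in the summary can be illustrated with a short sketch. The Python snippet below is a minimal, illustrative rendering of the EasyEnsemble idea only (repeated balanced undersampling of the majority class, one boosted learner per balanced subset, outputs combined at the end); it is not the authors' reference implementation. The base learner choice (scikit-learn's AdaBoostClassifier), the probability-averaging combination, and parameter names such as n_subsets are assumptions made for the sketch.

# Minimal EasyEnsemble-style sketch (illustrative, not the paper's reference code).
# Assumes a binary problem where label 1 is the minority class and label 0 the majority.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def easy_ensemble_fit(X, y, n_subsets=10, n_weak=10, random_state=0):
    """Train one boosted learner on each balanced (minority + sampled-majority) subset."""
    rng = np.random.default_rng(random_state)
    minority_idx = np.flatnonzero(y == 1)
    majority_idx = np.flatnonzero(y == 0)
    learners = []
    for _ in range(n_subsets):
        # Undersample the majority class down to the size of the minority class.
        sampled = rng.choice(majority_idx, size=minority_idx.size, replace=False)
        idx = np.concatenate([minority_idx, sampled])
        clf = AdaBoostClassifier(n_estimators=n_weak, random_state=random_state)
        clf.fit(X[idx], y[idx])
        learners.append(clf)
    return learners

def easy_ensemble_score(learners, X):
    """Combine the sub-learners by averaging their minority-class probabilities."""
    return np.mean([clf.predict_proba(X)[:, 1] for clf in learners], axis=0)

In the paper, EasyEnsemble combines the weak classifiers of all AdaBoost ensembles into a single final classifier; averaging predicted probabilities, as above, is a simpler stand-in used here only to convey the overall structure.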
ISSN: 1083-4419, 1941-0492
DOI: 10.1109/TSMCB.2008.2007853