A modified real-value negative selection detector-based oversampling approach for multiclass imbalance problems

Bibliographic Details
Published in Information Sciences, Vol. 556, pp. 160-176
Main Authors Liu, Ming; Dong, Minggang; Jing, Chao
Format Journal Article
Language English
Published Elsevier Inc 01.05.2021
ISSN 0020-0255, 1872-6291
DOI 10.1016/j.ins.2020.12.058

Summary: Skewed class distributions pose a major challenge in multiclass imbalanced problems and have attracted growing interest in the engineering and research communities. Conventional classifiers handle multiclass imbalanced data sets poorly because they were originally designed for balanced distributions. This paper proposes a modified real-value negative selection detector-based oversampling approach for the multiclass imbalance problem. Unlike previous work, we modify the traditional real-value negative selection algorithm and apply it to the multiclass imbalance problem. First, the modified real-value negative selection technique is used to create detectors for each minority class. Then, a supervision mechanism built on these detectors prevents overgeneralization. Furthermore, a selection weight based on crowding density and the detectors is devised to reduce the within-class imbalance of each minority class. Finally, simulations were conducted on public real-world multiclass imbalanced data sets from Knowledge Extraction based on Evolutionary Learning (KEEL) and the University of California Irvine (UCI). The experimental and statistical results show that the proposed algorithm outperforms seven well-known oversampling algorithms on five metrics (precision, recall, F-measure, Multiclass G-mean (MG), and Multiclass Area Under the ROC (MAUC)) across four types of classifiers.
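
The general idea of detector-supervised oversampling can be illustrated with a minimal sketch. It is not the authors' algorithm: the summary does not give the modified detector rules or the crowding-density weight, so the code below assumes the classic real-valued negative selection scheme (random hypersphere detectors that avoid "self" points), treats all samples outside the minority class as self, and picks interpolation pairs uniformly at random. The function names, `self_radius`, `n_detectors`, and the toy data are all hypothetical, and features are assumed to be min-max scaled to the unit hypercube.

```python
import numpy as np


def generate_detectors(self_points, n_detectors, self_radius, rng, max_tries=10_000):
    """Classic real-valued negative selection on the unit hypercube (assumed variant).

    A random candidate centre is kept only if it lies outside every self
    hypersphere; its radius is the remaining gap to the nearest self point.
    """
    dim = self_points.shape[1]
    centres, radii = [], []
    for _ in range(max_tries):
        if len(centres) == n_detectors:
            break
        candidate = rng.random(dim)
        nearest = np.linalg.norm(self_points - candidate, axis=1).min()
        if nearest > self_radius:
            centres.append(candidate)
            radii.append(nearest - self_radius)
    return np.array(centres).reshape(-1, dim), np.array(radii)


def covered(points, centres, radii):
    """Boolean mask: True where a point falls inside at least one detector."""
    if len(centres) == 0:
        return np.zeros(len(points), dtype=bool)
    dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return (dists <= radii).any(axis=1)


def oversample_minority(minority, others, n_new, rng,
                        n_detectors=200, self_radius=0.05, max_rounds=100):
    """Interpolate between minority pairs, keeping only detector-covered points.

    Pairs are drawn uniformly here; the paper's selection weight is not reproduced.
    """
    centres, radii = generate_detectors(others, n_detectors, self_radius, rng)
    accepted = np.empty((0, minority.shape[1]))
    for _ in range(max_rounds):
        if len(accepted) >= n_new:
            break
        i = rng.integers(len(minority), size=n_new)
        j = rng.integers(len(minority), size=n_new)
        gap = rng.random((n_new, 1))
        candidates = minority[i] + gap * (minority[j] - minority[i])
        keep = covered(candidates, centres, radii)  # supervision step
        accepted = np.vstack([accepted, candidates[keep]])
    return accepted[:n_new]


# Toy data (hypothetical): one minority cluster, one cluster of other classes.
rng = np.random.default_rng(0)
minority = rng.normal(0.2, 0.05, size=(10, 2))
others = rng.normal(0.8, 0.03, size=(30, 2))
synthetic = oversample_minority(minority, others, n_new=20, rng=rng)
```

Because each detector's radius is the distance to its nearest self point minus `self_radius`, any accepted synthetic sample stays at least `self_radius` away from every sample of the other classes, which is the sense in which the detectors guard against overgeneralization in this sketch.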