Adaptive over-sampling method for classification with application to imbalanced datasets in aluminum electrolysis

Bibliographic Details
Published in Neural computing & applications Vol. 32; no. 11; pp. 7183-7199
Main Authors Huang, Zhaoke; Yang, Chunhua; Chen, Xiaofang; Huang, Keke; Xie, Yongfang
Format Journal Article
Language English
Published London Springer London 01.06.2020
Springer Nature B.V
Summary: The class imbalance problem often appears in practical applications, where one class has numerous instances and the other has only a few. The Synthetic Minority Over-sampling TEchnique (SMOTE) is the most popular and commonly used sampling method for this problem. It has two important parameters: the over-sampling rate N and the number of nearest neighbors k. However, values of these two parameters chosen arbitrarily by users are generally not optimal in practical applications. In addition, imbalance ratios differ widely across datasets, which makes parameter selection in SMOTE more difficult. To overcome this problem, an adaptive over-sampling method based on SMOTE is proposed in this study. It transforms the parameter selection problem in SMOTE into a multi-objective optimization problem. Then, a new selection strategy named absolute dominance-based selection is proposed to obtain the current optimal solution. Finally, the state transition algorithm is used to search for the parameter values of SMOTE that best achieve the objectives. Four imbalanced benchmark datasets and four class-imbalanced aluminum electrolysis datasets are used to verify the validity of the proposed method. Compared with other methods, the proposed method shows good classification performance. Numerical results also show that the proposed method can successfully solve the class imbalance problem in aluminum electrolysis.
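To make the roles of the two SMOTE parameters concrete, the following is a minimal sketch of the basic SMOTE interpolation step, not the adaptive method proposed in the article. The function name smote_sketch and its arguments are hypothetical; n_rate is taken here as an integer number of synthetic samples generated per minority sample, a simplification of the over-sampling rate N, and k is the number of nearest neighbors.

```python
import numpy as np


def smote_sketch(minority, n_rate, k, seed=0):
    """Illustrative SMOTE-style over-sampling (hypothetical helper).

    minority: array of shape (n_samples, n_features), minority class only.
    n_rate:   synthetic samples to create per minority sample (assumed
              integer stand-in for the over-sampling rate N).
    k:        number of nearest neighbors to interpolate toward.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of minority samples")

    # Pairwise Euclidean distances between minority samples.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest minority neighbors of each sample.
    neighbors = np.argsort(dists, axis=1)[:, :k]

    synthetic = []
    for i in range(n):
        for _ in range(n_rate):
            j = rng.choice(neighbors[i])  # pick one of the k neighbors
            gap = rng.random()            # uniform in [0, 1)
            # New point on the segment between sample i and neighbor j.
            synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic).reshape(-1, minority.shape[1])
```

A larger n_rate yields more synthetic minority samples, while a larger k lets each new sample be drawn toward more distant neighbors. The multi-objective search over these two values described in the summary is not reproduced here.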
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-019-04208-7