Sparse feature selection and rare value prediction in imbalanced regression

Bibliographic Details
Published in Information sciences Vol. 680; p. 121145
Main Authors Guan, Ying, Fu, Guang-Hui
Format Journal Article
Language English
Published Elsevier Inc 01.10.2024
Summary: Feature selection addresses the dimensionality-reduction problem by determining a subset of the available features that facilitates the construction of effective prediction models. However, in regression tasks where the target variable is continuous and imbalanced, ordinary feature-selection techniques cannot simply be applied without adjustment. This paper proposes SerEnet, a novel method for sparse feature selection in imbalanced regression, which performs feature selection and estimation simultaneously by minimizing the Squared Error-Relevance with respect to a cutoff t (SERt), subject to a sparse penalty. Specifically, SERt measures prediction error on instances whose target values have relevance greater than t in the target-variable domain, and thus emphasizes the error on rare values. Moreover, SerEnet can effectively identify features that contribute significantly to rare cases, thereby reducing the dominant influence of common instances on feature selection and improving prediction performance for both rare values and the overall data. Experimental results on simulated and real datasets show that SerEnet outperformed several algorithms in terms of prediction performance on continuous, imbalanced data.
ISSN:0020-0255
DOI:10.1016/j.ins.2024.121145
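
Illustration: the summary describes minimizing SERt under a sparse penalty. The sketch below shows one way that idea could look in code; it is not the authors' SerEnet algorithm. It assumes SERt is the sum of squared errors over instances whose relevance is at least t, uses a made-up relevance function (scaled distance from the median), and swaps in a plain L1 penalty solved by proximal gradient descent. The function names `relevance`, `ser_t`, and `fit_sparse_ser` are hypothetical.

```python
import numpy as np

def relevance(y):
    # Hypothetical relevance function (an assumption, not from the paper):
    # 0 at the median, 1 at the most extreme target value.
    # Assumes y is not constant.
    d = np.abs(y - np.median(y))
    return d / d.max()

def ser_t(y_true, y_pred, phi, t):
    # Assumed form of SERt: squared error summed over instances
    # whose relevance phi is at least the cutoff t.
    mask = phi >= t
    return float(np.sum((y_true[mask] - y_pred[mask]) ** 2))

def fit_sparse_ser(X, y, phi, t, lam, n_iter=2000):
    # Proximal gradient (ISTA) on 0.5 * SERt(beta) + lam * ||beta||_1
    # for a linear model y ~ X @ beta. The L1 penalty is a stand-in for
    # the "sparse penalty" mentioned in the summary.
    mask = phi >= t
    Xr, yr = X[mask], y[mask]          # only relevant (rare) instances enter the loss
    step = 1.0 / (np.linalg.norm(Xr, 2) ** 2)  # 1 / largest squared singular value
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = Xr.T @ (Xr @ beta - yr)         # gradient of the smooth part
        z = beta - step * grad
        # Soft-thresholding: the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

# Toy usage with synthetic data: only the first feature drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0]
phi = relevance(y)
beta = fit_sparse_ser(X, y, phi, t=0.3, lam=0.1)
print("estimated coefficients:", beta)
print("SERt at t=0.3:", ser_t(y, X @ beta, phi, 0.3))
```

Because the loss only counts instances above the cutoff, features that matter for rare target values are not drowned out by the common cases, which is the intuition the summary gives for the method.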