Cost-sensitive sparse subset selection

Bibliographic Details
Published in International journal of machine learning and cybernetics Vol. 15; no. 4; pp. 1503-1515
Main Authors Wei, Lai; Liu, Shiteng
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.04.2024
Springer Nature B.V
Summary: Subset selection, i.e., selecting a few important representatives that reveal the intrinsic structure of a data set with a massive number of samples, is useful for many applications in machine learning and information retrieval. In this paper, we propose a cost-sensitive sparse regression-based subset selection method, termed cost-sensitive sparse subset selection (CS4). CS4 considers the cost that different subsets incur in predicting all the data samples in a given data set and chooses a subset with minimal prediction cost. Hence, compared with related sparse regression-based methods, CS4 is capable of selecting the most informative representatives to characterize the structure of a data set. Moreover, we present an optimization algorithm for solving the CS4 problem and analyze its convergence and computational complexity. The relationships between CS4 and related algorithms are also discussed. Finally, experiments on representative selection and classification show the effectiveness and superiority of CS4.
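
The summary does not state the CS4 objective or its cost term, so the sketch below does not implement CS4. It only illustrates the general family of "sparse regression-based" subset selection that the summary compares against, under assumed details: each sample is reconstructed from the other samples via a coefficient matrix C, a row-sparsity penalty pushes most rows of C to zero, and samples whose rows stay nonzero are taken as representatives. The function names, the penalty weight `lam`, and the proximal-gradient solver are illustrative choices, not taken from the paper.

```python
import numpy as np


def objective(X, C, lam):
    """0.5 * ||X - X C||_F^2 + lam * sum of row-wise 2-norms of C."""
    residual = X - X @ C
    return 0.5 * np.sum(residual ** 2) + lam * np.sum(np.linalg.norm(C, axis=1))


def row_sparse_representatives(X, lam=1.0, n_iter=500):
    """Generic row-sparse self-representation (an assumed baseline, not CS4).

    X has shape (d, n): each column is one sample.
    Returns C of shape (n, n) and a score per sample (the 2-norm of its row).
    """
    n = X.shape[1]
    G = X.T @ X                          # Gram matrix, reused every iteration
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = G @ C - G                 # gradient of the least-squares term
        Z = C - grad / L                 # plain gradient step
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        # Row-wise soft-thresholding: the proximal operator of the row-norm penalty.
        shrink = np.maximum(0.0, 1.0 - (lam / L) / np.maximum(norms, 1e-12))
        C = shrink * Z
    return C, np.linalg.norm(C, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 30))     # synthetic data, for illustration only
    C, scores = row_sparse_representatives(X, lam=2.0)
    ranked = np.argsort(-scores)         # samples ordered by representative score
    print("candidate representatives:", ranked[:5])
```

Larger values of `lam` zero out more rows and so yield fewer representatives. A cost-sensitive variant such as the one the summary describes would presumably change how candidate subsets are scored, but that formulation is not recoverable from this record.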
ISSN: 1868-8071, 1868-808X
DOI: 10.1007/s13042-023-01979-3