Cost-sensitive sparse subset selection
Published in | International Journal of Machine Learning and Cybernetics, Vol. 15, no. 4, pp. 1503-1515 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Berlin/Heidelberg: Springer Berlin Heidelberg, 01.04.2024; Springer Nature B.V |
Summary: | Subset selection, i.e., choosing a few important representatives that reveal the intrinsic structure of a data set with a massive number of samples, is useful in many machine learning and information retrieval applications. This paper proposes a cost-sensitive sparse regression-based subset selection method, termed cost-sensitive sparse subset selection (CS4). CS4 considers the cost that different subsets incur in predicting all the data samples in a given data set and chooses a subset with minimal prediction cost. Hence, compared with related sparse regression-based methods, CS4 is capable of selecting the most informative representatives to characterize the structure of a data set. The paper also presents an optimization algorithm for solving the CS4 problem, analyzes its convergence and computational complexity, and discusses the relationships between CS4 and related algorithms. Finally, experiments on representative selection and classification show the effectiveness and superiority of CS4. |
---|---|
ISSN: | 1868-8071, 1868-808X |
DOI: | 10.1007/s13042-023-01979-3 |
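
The summary does not state the CS4 objective, so it cannot be reproduced from this record. For orientation only, the sketch below illustrates the generic family that CS4 is compared against (sparse regression-based representative selection), not CS4 itself. It assumes a row-sparse self-representation objective, min_C 0.5 * ||Y - Y C||_F^2 + lam * sum_i ||C[i, :]||_2, solved by plain proximal gradient descent. The function names, the parameters `lam`, `k`, and `n_iter`, and the top-k selection rule are all hypothetical, and no cost-sensitive term is included.

```python
import numpy as np


def objective(X, C, lam):
    """Value of 0.5 * ||Y - Y C||_F^2 + lam * sum_i ||C[i, :]||_2 with Y = X.T."""
    Y = X.T
    residual = Y - Y @ C
    return 0.5 * np.sum(residual ** 2) + lam * np.sum(np.linalg.norm(C, axis=1))


def select_representatives(X, lam=1.0, k=3, n_iter=500):
    """Generic row-sparse self-representation baseline (NOT the CS4 method).

    X: array of shape (n_samples, n_features).
    Returns the indices of the k rows of C with the largest norm, and C.
    """
    Y = X.T                       # columns of Y are samples
    n = Y.shape[1]
    G = Y.T @ Y                   # Gram matrix, shape (n, n)
    L = np.linalg.norm(G, 2)      # spectral norm = Lipschitz constant of the gradient
    step = 1.0 / L if L > 0 else 1.0
    C = np.zeros((n, n))
    for _ in range(n_iter):
        # Gradient step on the smooth term: grad = Y^T (Y C - Y) = G C - G
        V = C - step * (G @ C - G)
        # Proximal step for the row-wise L2 penalty: shrink each row toward zero
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        C = shrink * V
    # Samples whose rows carry the most reconstruction weight act as representatives
    row_norms = np.linalg.norm(C, axis=1)
    idx = np.argsort(-row_norms)[:k]
    return idx, C
```

A call such as `idx, C = select_representatives(X, lam=0.5, k=3)` returns three sample indices. Larger values of `lam` drive more rows of `C` to zero, so fewer samples contribute to reconstructing the rest.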