Support vector machine and optimal parameter selection for high-dimensional imbalanced data
Published in: Communications in Statistics - Simulation and Computation, Vol. 51, No. 11, pp. 6739-6754
Format: Journal Article
Language: English
Published: Philadelphia: Taylor & Francis, 02.11.2022
Summary: In this article, we consider asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings. In particular, we treat high-dimensional imbalanced data. We investigate the behavior of the SVM with respect to a regularization parameter C in a framework of kernel functions. We show that the SVM cannot handle imbalanced classification and is heavily biased in HDLSS settings. To overcome these difficulties, we propose a robust SVM (RSVM), which gives excellent performance in HDLSS settings. We also give a pre-selection method for the parameters included in a kernel function that does not require cross-validation. Finally, we check the performance of RSVM and the optimality of the parameter choice in numerical simulations and real data analyses.
ISSN: 0361-0918; 1532-4141
DOI: 10.1080/03610918.2020.1813300
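The article's RSVM and its cross-validation-free parameter pre-selection are not available in standard libraries. As a minimal sketch of the imbalance bias that the summary describes, the snippet below trains a plain soft-margin linear SVM on imbalanced HDLSS data and compares it with scikit-learn's generic class reweighting, which is a common remedy but not the RSVM proposed in the article; the data-generating model, dimensions, and parameter values are assumptions chosen only for illustration.

```python
# Illustration only (not the article's method): a plain linear SVM on
# high-dimension, low-sample-size (HDLSS), imbalanced two-class data.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
d, n_major, n_minor = 2000, 40, 10   # d >> n (HDLSS), 4:1 class imbalance

def sample(n, shift):
    # Two Gaussian classes differing only by a small mean shift
    # in the first 10 coordinates (an assumed toy model).
    X = rng.standard_normal((n, d))
    X[:, :10] += shift
    return X

X_train = np.vstack([sample(n_major, 0.0), sample(n_minor, 0.5)])
y_train = np.array([0] * n_major + [1] * n_minor)
X_test = np.vstack([sample(200, 0.0), sample(200, 0.5)])
y_test = np.array([0] * 200 + [1] * 200)

# Compare an unweighted SVM with scikit-learn's class reweighting.
for weights in (None, "balanced"):
    clf = SVC(kernel="linear", C=1.0, class_weight=weights).fit(X_train, y_train)
    acc = balanced_accuracy_score(y_test, clf.predict(X_test))
    print(f"class_weight={weights}: balanced accuracy = {acc:.2f}")
```

Balanced accuracy is reported so that a classifier that collapses onto the majority class shows up as a score near 0.5; the reweighted run is included only as a familiar baseline against which methods such as RSVM are motivated.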