Support vector machine and optimal parameter selection for high-dimensional imbalanced data

Bibliographic Details
Published in: Communications in Statistics - Simulation and Computation, Vol. 51, No. 11, pp. 6739-6754
Main Author: Nakayama, Yugo
Format: Journal Article
Language: English
Published: Philadelphia, Taylor & Francis, 02.11.2022

Summary: In this article, we consider asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings. In particular, we treat high-dimensional imbalanced data. We investigate the behavior of SVM with respect to the regularization parameter C in a framework of kernel functions. We show that SVM cannot handle imbalanced classification and is severely biased in HDLSS settings. To overcome these difficulties, we propose a robust SVM (RSVM), which gives excellent performance in HDLSS settings. We also give a method for pre-selecting the parameters of a kernel function without cross-validation. Finally, we check the performance of RSVM and the optimality of the parameter choice in numerical simulations and real data analyses.
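
The abstract does not give the concrete form of RSVM or of its cross-validation-free parameter pre-selection, so the sketch below only illustrates the surrounding ideas: a plain Gaussian-kernel SVM versus a class-weighted variant on synthetic HDLSS imbalanced data, with the kernel bandwidth set by the median heuristic instead of cross-validation. The weighting scheme, the median heuristic, and all names and constants here are illustrative assumptions, not the method proposed in the article.

```python
# Minimal sketch (not the paper's RSVM): contrast a plain soft-margin SVM with
# a class-weighted Gaussian-kernel SVM on synthetic high-dimension,
# low-sample-size (HDLSS) imbalanced data, choosing the kernel bandwidth
# without cross-validation via the median heuristic.
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
d, n_major, n_minor = 2000, 40, 10          # HDLSS: d >> n, imbalanced classes

# Two Gaussian classes separated by a small mean shift in every coordinate.
X = np.vstack([rng.normal(0.0, 1.0, size=(n_major, d)),
               rng.normal(0.15, 1.0, size=(n_minor, d))])
y = np.array([0] * n_major + [1] * n_minor)

# Balanced test set drawn from the same model.
Xt = np.vstack([rng.normal(0.0, 1.0, size=(200, d)),
                rng.normal(0.15, 1.0, size=(200, d))])
yt = np.array([0] * 200 + [1] * 200)

# Median heuristic: gamma = 1 / (median squared pairwise distance).
# One common cross-validation-free choice, used only as a stand-in for the
# pre-selection idea mentioned in the abstract.
gamma = 1.0 / np.median(pdist(X, metric="sqeuclidean"))

for name, clf in [
    ("plain SVM", SVC(kernel="rbf", C=1.0, gamma=gamma)),
    ("class-weighted SVM", SVC(kernel="rbf", C=1.0, gamma=gamma,
                               class_weight="balanced")),
]:
    clf.fit(X, y)
    bal = balanced_accuracy_score(yt, clf.predict(Xt))
    print(f"{name:>20s}: balanced accuracy = {bal:.3f}")
```

In this regime the unweighted SVM tends to favor the majority class, which is the kind of bias the abstract refers to; class reweighting is only one simple remedy and is not the RSVM construction itself.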
ISSN: 0361-0918; 1532-4141
DOI: 10.1080/03610918.2020.1813300