Support vector machine and optimal parameter selection for high-dimensional imbalanced data
Published in: Communications in Statistics - Simulation and Computation, Vol. 51, No. 11, pp. 6739-6754
Format: Journal Article
Language: English
Published: Philadelphia: Taylor & Francis, 02.11.2022
Summary: In this article, we consider asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings. In particular, we treat high-dimensional imbalanced data. We investigate the behavior of the SVM with respect to a regularization parameter C in a framework of kernel functions. We show that the SVM cannot handle imbalanced classification and is heavily biased in HDLSS settings. To overcome these difficulties, we propose a robust SVM (RSVM), which gives excellent performance in HDLSS settings. We also give a pre-selection method for the parameters included in a kernel function that does not require cross-validation. Finally, we check the performance of RSVM and the optimality of the parameter choice in numerical simulations and real data analyses.
ISSN: 0361-0918; 1532-4141
DOI: 10.1080/03610918.2020.1813300
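The article's RSVM and its cross-validation-free parameter pre-selection are not available in standard libraries. As a minimal sketch of the imbalance bias that the summary describes, the snippet below trains a plain soft-margin linear SVM on imbalanced HDLSS data and compares it with scikit-learn's generic class reweighting, which is a common remedy but not the RSVM proposed in the article; the data-generating model, dimensions, and parameter values are assumptions chosen only for illustration.

```python
# Illustration only (not the article's method): a plain linear SVM on
# high-dimension, low-sample-size (HDLSS), imbalanced two-class data.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
d, n_major, n_minor = 2000, 40, 10   # d >> n (HDLSS), 4:1 class imbalance

def sample(n, shift):
    # Two Gaussian classes differing only by a small mean shift
    # in the first 10 coordinates (an assumed toy model).
    X = rng.standard_normal((n, d))
    X[:, :10] += shift
    return X

X_train = np.vstack([sample(n_major, 0.0), sample(n_minor, 0.5)])
y_train = np.array([0] * n_major + [1] * n_minor)
X_test = np.vstack([sample(200, 0.0), sample(200, 0.5)])
y_test = np.array([0] * 200 + [1] * 200)

# Compare an unweighted SVM with scikit-learn's class reweighting.
for weights in (None, "balanced"):
    clf = SVC(kernel="linear", C=1.0, class_weight=weights).fit(X_train, y_train)
    acc = balanced_accuracy_score(y_test, clf.predict(X_test))
    print(f"class_weight={weights}: balanced accuracy = {acc:.2f}")
```

Balanced accuracy is reported so that a classifier that collapses onto the majority class shows up as a score near 0.5; the reweighted run is included only as a familiar baseline against which methods such as RSVM are motivated.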