Subjects classification from high-dimensional and small-sample size datasets using a strategy based on Clustering Variables around Latent Components (CLV) method
Main Author | |
---|---|
Format | Journal Article |
Language | English |
Published | 14.06.2017 |
DOI | 10.48550/arxiv.1706.04633 |
Summary: | High-dimensional complex systems can be studied through multivariate
analysis, such as Principal Component Analysis; however, such methods
frequently require large samples of observations. Here, a method for small
samples, based on clustering variables around latent variables (CLV), is
examined for classifying subjects into two presumed groups. To this end, a
predictive model was developed to generate datasets with two groups of cases
whose variables show randomness features (up to 30% of the variables differ
between groups, and up to 7% of those are correlated with one another). The
method recovered the information of the latent factors to classify the
subjects with 80 to 95% agreement, with a positive relationship between
classifier precision and the ratio [number of variables / number of subjects]. |
---|---|
DOI: | 10.48550/arxiv.1706.04633 |
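
The workflow described in the summary (simulate two groups, cluster the variables around latent components, classify subjects from a latent component) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, sample sizes, group shift of 3.0, and the alternating partition loop are all assumptions made for illustration, and the correlated subset of variables mentioned in the summary is left out for brevity.

```python
import numpy as np


def simulate_two_groups(n_subjects=20, n_vars=200, frac_diff=0.30,
                        shift=3.0, seed=0):
    """Toy generator (assumed design): Gaussian noise, with the first
    frac_diff of the variables shifted in the second group."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_subjects // 2)
    X = rng.standard_normal((n_subjects, n_vars))
    n_diff = int(frac_diff * n_vars)
    X[labels == 1, :n_diff] += shift
    return X, labels


def standardize(A):
    """Center each column and scale it to unit variance."""
    return (A - A.mean(axis=0)) / A.std(axis=0)


def clv_like_partition(X, n_clusters=2, n_iter=20, seed=0):
    """Simplified CLV-style loop: each cluster's latent component is the
    first principal component of its variables; each variable then moves
    to the component with which it has the largest squared correlation."""
    rng = np.random.default_rng(seed)
    Z = standardize(X)
    n = Z.shape[0]
    assign = rng.integers(n_clusters, size=Z.shape[1])
    for _ in range(n_iter):
        comps = []
        for k in range(n_clusters):
            block = Z[:, assign == k]
            if block.shape[1] == 0:
                # Empty cluster: restart it from a random direction.
                comps.append(rng.standard_normal(n))
                continue
            u, s, _ = np.linalg.svd(block, full_matrices=False)
            comps.append(u[:, 0] * s[0])
        C = standardize(np.column_stack(comps))
        r2 = (Z.T @ C / n) ** 2  # squared correlations, (n_vars, n_clusters)
        new_assign = r2.argmax(axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign
    return assign, C


def classify_agreement(X, labels):
    """Split subjects by the sign of the strongest latent component and
    report agreement with the true labels (up to label swapping)."""
    assign, C = clv_like_partition(X)
    Z = standardize(X)
    r2 = (Z.T @ C / Z.shape[0]) ** 2
    strength = [r2[assign == k, k].sum() for k in range(C.shape[1])]
    score = C[:, int(np.argmax(strength))]
    pred = (score > 0).astype(int)
    acc = float(np.mean(pred == labels))
    return max(acc, 1.0 - acc)


if __name__ == "__main__":
    X, y = simulate_two_groups()
    print("variables / subjects:", X.shape[1] / X.shape[0])
    print("agreement:", classify_agreement(X, y))
```

Because the group labels recovered from a latent component are arbitrary, agreement is reported as the larger of the accuracy and its complement. Varying `n_vars` while holding `n_subjects` fixed is one way to explore the relationship with the ratio [number of variables / number of subjects] that the summary reports.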