A comparative study of nonlinear manifold learning methods for cancer microarray data classification

Bibliographic Details
Published in: Expert Systems with Applications, Vol. 40, no. 6, pp. 2189-2197
Main Authors: Orsenigo, Carlotta; Vercellis, Carlo
Format: Journal Article
Language: English
Published: Amsterdam, Elsevier Ltd, 01.05.2013
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2012.10.044

Summary:
► An empirical comparison of nonlinear manifold learning techniques is presented.
► Methods were applied for dimensionality reduction in microarray data classification.
► LLE-based classifier emerged as the most effective method.
► Isomap turned out to be the second best alternative for dimensionality reduction.

The paper presents an empirical comparison of the most prominent nonlinear manifold learning techniques for dimensionality reduction in the context of high-dimensional microarray data classification. In particular, we assessed the performance of six methods: isometric feature mapping (Isomap), locally linear embedding (LLE), Laplacian eigenmaps (LE), Hessian eigenmaps (HE), local tangent space alignment (LTSA) and maximum variance unfolding (MVU). Unlike previous studies on the subject, the experimental framework adopted in this work properly extends the supervised learning paradigm to dimensionality reduction, by regarding the test set as an out-of-sample set of new points which are excluded from the manifold learning process. This avoids a possible overestimate of the classification accuracy, which may yield misleading comparative results. This empirical approach requires a fast and effective out-of-sample embedding method for mapping new high-dimensional data points into an existing reduced space. To this end, we propose to apply multi-output kernel ridge regression, an extension of linear ridge regression based on kernel functions, which has recently been presented as a powerful method for out-of-sample projection when combined with a variant of isometric feature mapping. Computational experiments on a wide collection of cancer microarray data sets show that classifiers based on Isomap, LLE and LE were consistently more accurate than those relying on HE, LTSA and MVU. In particular, under different experimental conditions the LLE-based classifier emerged as the most effective method, whereas the Isomap algorithm turned out to be the second best alternative for dimensionality reduction.
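
The out-of-sample protocol described in the summary can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses scikit-learn, a synthetic data set standing in for microarray data, an RBF kernel, a k-nearest-neighbour classifier, and hand-picked hyperparameters, none of which are specified in the summary. The function name `embed_and_classify` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.kernel_ridge import KernelRidge
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def embed_and_classify(X_train, y_train, X_test, n_components=5, n_neighbors=10):
    """Hypothetical helper: learn the manifold on training points only,
    project test points by multi-output kernel ridge regression, classify."""
    # 1. Manifold learning sees the training set only (test points excluded).
    lle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components, random_state=0
    )
    Z_train = lle.fit_transform(X_train)

    # 2. Regress the embedding coordinates on the original features.
    #    The target is 2-D, so there is one regression output per embedding
    #    dimension. Kernel, alpha and gamma are assumptions for illustration.
    krr = KernelRidge(kernel="rbf", alpha=1e-3, gamma=1.0 / X_train.shape[1])
    krr.fit(X_train, Z_train)

    # 3. Out-of-sample projection of the held-out points into the reduced space.
    Z_test = krr.predict(X_test)

    # 4. Classify in the reduced space (classifier choice is an assumption).
    clf = KNeighborsClassifier(n_neighbors=3).fit(Z_train, y_train)
    return Z_test, clf.predict(Z_test)


# Synthetic stand-in for microarray data: few samples, many features.
X, y = make_classification(
    n_samples=80, n_features=500, n_informative=10, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

Z_test, y_pred = embed_and_classify(X_train, y_train, X_test)
print("test accuracy:", np.mean(y_pred == y_test))
```

The point of the design is that the embedding is never fitted on the test points, so the reported accuracy is not inflated by information from them leaking into the dimensionality reduction step.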