A direct approach to sparse discriminant analysis in ultra-high dimensions

Bibliographic Details
Published in: Biometrika Vol. 99; no. 1; pp. 29-42
Main Authors: Mai, Qing; Zou, Hui; Yuan, Ming
Format: Journal Article
Language: English
Published: Oxford University Press for Biometrika Trust, 01.03.2012
Series: Biometrika
Summary: Sparse discriminant methods based on independence rules, such as the nearest shrunken centroids classifier (Tibshirani et al., 2002) and features annealed independence rules (Fan & Fan, 2008), have been proposed as computationally attractive tools for feature selection and classification with high-dimensional data. A fundamental drawback of these rules is that they ignore correlations among features and thus could produce misleading feature selection and inferior classification. We propose a new procedure for sparse discriminant analysis, motivated by the least squares formulation of linear discriminant analysis. To demonstrate our proposal, we study the numerical and theoretical properties of discriminant analysis constructed via lasso penalized least squares. Our theory shows that the method proposed can consistently identify the subset of discriminative features contributing to the Bayes rule and at the same time consistently estimate the Bayes classification direction, even when the dimension can grow faster than any polynomial order of the sample size. The theory allows for general dependence among features. Simulated and real data examples show that lassoed discriminant analysis compares favourably with other popular sparse discriminant proposals. Copyright 2012, Oxford University Press.
ISSN: 0006-3444, 1464-3510
DOI: 10.1093/biomet/asr066
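
Illustrative sketch: the summary describes discriminant analysis built from lasso penalized least squares. The Python code below is a minimal illustration of that general idea, not the authors' implementation. Everything specific in it is an assumption made for the example: the -1/+1 class coding, the penalty scaling, the plain coordinate descent solver, the midpoint intercept (which presumes equal class priors), the hypothetical function names, and the simulated data with independent noise features.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Assumes the columns of X and the vector y are already centred.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-feature second moments
    resid = y.astype(float).copy()      # residual y - X @ beta (beta starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta


def fit_lasso_discriminant(X, labels, lam):
    """Regress a coded class label on the features with a lasso penalty.

    labels must contain 0/1. The -1/+1 coding and the midpoint intercept
    are assumptions of this sketch (the latter presumes equal priors).
    """
    y = np.where(labels == 1, 1.0, -1.0)
    beta = lasso_cd(X - X.mean(axis=0), y - y.mean(), lam)
    mu0 = X[labels == 0].mean(axis=0)
    mu1 = X[labels == 1].mean(axis=0)
    intercept = -((mu0 + mu1) / 2.0) @ beta
    return beta, intercept


def predict(X, beta, intercept):
    """Assign class 1 when the linear discriminant score is positive."""
    return (X @ beta + intercept > 0).astype(int)


# Simulated example (an assumption): only the first two features differ in mean.
rng = np.random.default_rng(0)
n, p = 200, 50
labels = np.repeat([0, 1], n // 2)
X = rng.standard_normal((n, p))
X[labels == 1, :2] += 2.0

beta, intercept = fit_lasso_discriminant(X, labels, lam=0.3)
selected = np.flatnonzero(beta)
accuracy = (predict(X, beta, intercept) == labels).mean()
print("selected features:", selected)
print("training accuracy:", accuracy)
```

The nonzero entries of `beta` act as the selected features and `beta` itself as the estimated classification direction. In practice the penalty level `lam` would be chosen by cross-validation rather than fixed by hand as it is here.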