Sparse Subspace Clustering: Algorithm, Theory, and Applications

Bibliographic Details
Published in	IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, no. 11, pp. 2765-2781
Main Authors	Elhamifar, E.; Vidal, R.
Format	Journal Article
Language	English
Published	Los Alamitos, CA: IEEE Computer Society, 01.11.2013
Subjects	ℓ1-minimization | Algorithms | Applied sciences | Artificial Intelligence | Biological and medical sciences | Biometry - methods | clustering | Clustering algorithms | Computer science; control theory; systems | Computer vision | convex programming | Data processing. List processing. Character string processing | Detection, estimation, filtering, equalization, prediction | Exact sciences and technology | Face | Face - anatomy & histology | face clustering | Fundamental and applied biological sciences. Psychology | Gene expression | High-dimensional data | Humans | Image Interpretation, Computer-Assisted - methods | Information, signal and communications theory | intrinsic low-dimensionality | Memory organisation. Data processing | Molecular and cellular biology | Molecular genetics | motion segmentation | Noise | Optimization | Pattern Recognition, Automated - methods | principal angles | Programming theory | Sample Size | Signal and communications theory | Signal, noise | Software | Sparse matrices | sparse representation | spectral clustering | subspaces | Telecommunications and information theory | Theoretical computing | Vectors | Cluster analysis | Program optimization | DNA chip | Video signal | Vector space | Facies | Data distribution | Bioinformatics | Data analysis | Dimensionality | Spectral method | Motion estimation | Subspace method | Cluster | Minimization | Text | Image segmentation | Multidimensional database | NP-hard problem | Data models | Synthetic data
Summary:	Many real-world problems deal with collections of high-dimensional data, such as images, videos, text and web documents, DNA microarray data, and more. Often, such high-dimensional data lie close to low-dimensional structures corresponding to several classes or categories to which the data belong. In this paper, we propose and study an algorithm, called sparse subspace clustering, to cluster data points that lie in a union of low-dimensional subspaces. The key idea is that, among the infinitely many possible representations of a data point in terms of other points, a sparse representation corresponds to selecting a few points from the same subspace. This motivates solving a sparse optimization program whose solution is used in a spectral clustering framework to infer the clustering of the data into subspaces. Since solving the sparse optimization program is in general NP-hard, we consider a convex relaxation and show that, under appropriate conditions on the arrangement of the subspaces and the distribution of the data, the proposed minimization program succeeds in recovering the desired sparse representations. The proposed algorithm is efficient and can handle data points near the intersections of subspaces. Another key advantage of the proposed algorithm with respect to the state of the art is that it can deal directly with data nuisances, such as noise, sparse outlying entries, and missing entries, by incorporating the model of the data into the sparse optimization program. We demonstrate the effectiveness of the proposed algorithm through experiments on synthetic data as well as the two real-world problems of motion segmentation and face clustering.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2013.57
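
Illustrative sketch (not part of the catalog record): the two stages named in the Summary, a sparse self-representation followed by spectral clustering, can be outlined in a few lines of NumPy. Everything below is an assumption made for illustration rather than the authors' implementation: the ℓ1 relaxation is written as a Lasso-penalized program, it is solved with a plain proximal-gradient (ISTA) loop chosen for brevity, the k-means step uses a simple farthest-point initialization, and the function names, the value of lam, and the toy data are all hypothetical.

```python
import numpy as np

def sparse_self_representation(Y, lam=0.05, n_iter=1000):
    """Columns of Y are data points. Approximately solves
    min_C 0.5*||Y - Y C||_F^2 + lam*||C||_1  subject to diag(C) = 0
    with proximal gradient descent (ISTA); an illustrative solver choice."""
    n = Y.shape[1]
    G = Y.T @ Y                              # Gram matrix of the points
    step = 1.0 / np.linalg.norm(G, 2)        # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        Z = C - step * (G @ C - G)           # gradient step on the quadratic term
        C = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(C, 0.0)             # a point may not represent itself
    return C

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_clustering(W, k):
    """Normalized spectral clustering on a symmetric affinity matrix W."""
    d = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    U = vecs[:, :k]                          # eigenvectors of the k smallest
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return kmeans(U, k)

def ssc(Y, k, lam=0.05):
    """Sparse coefficients -> symmetric affinity -> spectral clustering."""
    C = sparse_self_representation(Y, lam=lam)
    W = np.abs(C) + np.abs(C).T
    return spectral_clustering(W, k)

# Toy data (hypothetical): six unit vectors in each of two orthogonal
# 2-D subspaces of R^4, stored as the columns of Y.
angles = np.deg2rad(np.arange(0, 180, 30))
plane = np.stack([np.cos(angles), np.sin(angles)])
Y = np.zeros((4, 12))
Y[:2, :6] = plane
Y[2:, 6:] = plane

C = sparse_self_representation(Y)
labels = ssc(Y, k=2)
print(labels)
```

In this toy setting the subspaces are orthogonal, so the coefficient matrix C has no entries linking points from different subspaces, which is the property the Summary refers to when it says a sparse representation selects points from the same subspace. Real data with noise, outliers, or intersecting subspaces would need the extended models described in the paper, which this sketch does not attempt.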