Supervised discriminative dimensionality reduction by learning multiple transformation operators
| Published in | Expert Systems with Applications, Vol. 164, p. 113958 |
| --- | --- |
| Format | Journal Article |
| Language | English |
| Published | New York: Elsevier Ltd, 01.02.2021 |
Summary: | Analyzing and learning from high-dimensional data has always been challenging in machine learning, causing serious computational complexity and poor learning performance. Supervised dimensionality reduction is a popular technique for addressing these challenges in supervised learning tasks, where data are accompanied by labels. Traditionally, such techniques learn a single transformation that projects the data into a low-dimensional discriminative subspace. However, a single transformation learned for the whole data set can be dominated by one or a few classes, so the remaining classes receive less discrimination in the reduced space. In other words, one transformation is insufficient to properly discriminate the classes in the reduced space, because they may have complex and completely dissimilar distributions. This insufficiency becomes even more serious as the number of classes grows, leading to poor discrimination and degraded learning performance in the reduced space. To overcome this limitation, we propose a novel supervised dimensionality reduction method that learns per-class transformations by optimizing a newly designed and efficient objective function. The proposed method captures more discriminative information from each individual class than a single shared transformation can. Moreover, the proposed objective function enjoys several desirable properties: (1) maximizing margins between the transformed classes of data, (2) having a closed-form solution, (3) being easily kernelized for nonlinear data, (4) preventing overfitting, and (5) ensuring the transformations are row-sparse so that discriminative features are learned in the reduced space. Experimental results verify that the proposed method is superior to related state-of-the-art methods and promising for generating discriminative embeddings. |
•Learning per-class transformations while enjoying a closed-form solution.
•Adding an L2,1 norm to avoid overfitting while ensuring the transformations are row-sparse.
•Learning discriminative features in the transformed space.
•Constructing the kernelized version of the proposed method.
•Providing a convergence proof for the regularized version of the proposed method.
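The paper's exact objective function is not reproduced in this record, but the core idea — learning one projection per class via a closed-form (eigen-decomposition) solution, with a ridge-style term against overfitting — can be illustrated with a minimal NumPy sketch. The LDA-like per-class criterion below (separating each class from all others) is an assumption for illustration, not the authors' actual objective; the function name `per_class_transforms` and the toy data are likewise hypothetical.

```python
import numpy as np

def per_class_transforms(X, y, d, reg=1e-3):
    """Learn one d-dimensional projection per class (illustrative sketch only).

    For each class c, maximizes the scatter of other-class samples around
    class c's mean relative to class c's own scatter -- an LDA-like surrogate
    for the paper's margin-based objective, solved in closed form via a
    generalized eigenproblem.
    """
    transforms = {}
    for c in np.unique(y):
        Xc, Xo = X[y == c], X[y != c]
        mu = Xc.mean(axis=0)
        # within-class scatter of class c, plus a ridge term against overfitting
        Sw = (Xc - mu).T @ (Xc - mu) + reg * np.eye(X.shape[1])
        # scatter of the remaining classes around class c's mean
        Sb = (Xo - mu).T @ (Xo - mu)
        # closed-form solution: top-d eigenvectors of Sw^{-1} Sb
        vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
        order = np.argsort(vals.real)[::-1][:d]
        transforms[c] = vecs[:, order].real  # shape (D, d)
    return transforms

# toy usage: three Gaussian classes in 5 dimensions, reduced to 2 per class
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(30, 5)) for m in (0.0, 3.0, -3.0)])
y = np.repeat([0, 1, 2], 30)
Ws = per_class_transforms(X, y, d=2)
Z0 = X[y == 0] @ Ws[0]   # class-0 samples in class 0's reduced space
print(Z0.shape)          # (30, 2)
```

Each class thus gets its own reduced space, so no single class's distribution dominates the embedding — the motivation stated in the abstract.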
| ISSN | 0957-4174; 1873-6793 |
| DOI | 10.1016/j.eswa.2020.113958 |