Hyper-Graph Regularized Constrained NMF for Selecting Differentially Expressed Genes and Tumor Classification

Bibliographic Details
Published in: IEEE Journal of Biomedical and Health Informatics, Vol. 24, no. 10, pp. 3002-3011
Main Authors: Jiao, Cui-Na; Gao, Ying-Lian; Yu, Na; Liu, Jin-Xing; Qi, Lian-Yong
Format: Journal Article
Language: English
Published: United States, IEEE, 01.10.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Non-negative Matrix Factorization (NMF) is a dimensionality reduction approach that learns a parts-based, linear representation of non-negative data, a property that has drawn growing attention to the method. In practice, however, NMF neglects the manifold structure of the data samples and overlooks the prior label information of the different classes. This paper proposes a novel matrix decomposition method, Hyper-graph regularized Constrained Non-negative Matrix Factorization (HCNMF), for selecting differentially expressed genes and classifying tumor samples. Hyper-graph learning captures local spatial information in high-dimensional data, so the method incorporates a hyper-graph regularization constraint to account for higher-order relationships among data samples. Applying hyper-graph theory helps identify pathogenic genes in cancer datasets. In addition, label information is incorporated into the objective function to improve the discriminative ability of the decomposition matrix, and this supervision improves classification performance. The authors also provide iterative update rules and convergence proofs for the HCNMF optimization problem. Experiments on The Cancer Genome Atlas (TCGA) datasets indicate that HCNMF outperforms other representative algorithms across a set of evaluations.
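The summary does not state the objective function or the update rules, so the following is only a minimal sketch of how a hyper-graph regularized, label-constrained NMF could be set up, not the authors' implementation. It assumes the objective ||X - U(AZ)^T||_F^2 + lam * Tr((AZ)^T L (AZ)), where A is a label constraint matrix and L = Dv - S is a hyper-graph Laplacian; hyperedges built from k nearest neighbours with unit weights, and the multiplicative updates derived from that assumed objective, are illustrative choices. The names `hypergraph_parts`, `label_constraint`, `hcnmf_sketch`, `lam`, and `k` are hypothetical.

```python
import numpy as np

def hypergraph_parts(X, k=3):
    """X is features x samples. Returns (S, Dv) with Laplacian L = Dv - S.

    Assumption: one hyperedge per sample, containing the sample and its
    k nearest neighbours, each hyperedge with unit weight.
    """
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    H = np.zeros((n, n))                              # vertex x hyperedge
    for e in range(n):
        H[np.argsort(d2[e])[:k + 1], e] = 1.0
        H[e, e] = 1.0                                 # edge always holds its centre
    w = np.ones(n)                                    # unit hyperedge weights
    De = H.sum(axis=0)                                # hyperedge degrees
    Dv = np.diag(H @ w)                               # vertex degrees
    S = (H * (w / De)) @ H.T                          # H W De^{-1} H^T
    return S, Dv

def label_constraint(labels):
    """Build A so that V = A @ Z. Labels < 0 mark unlabeled samples.

    Labeled samples of one class share a column (hence one row of Z);
    each unlabeled sample gets its own column.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels[labels >= 0])
    A = np.zeros((labels.size, classes.size + int(np.sum(labels < 0))))
    col = classes.size
    for i, y in enumerate(labels):
        if y >= 0:
            A[i, np.searchsorted(classes, y)] = 1.0
        else:
            A[i, col] = 1.0
            col += 1
    return A

def hcnmf_sketch(X, labels, rank, lam=0.1, k=3, iters=200, seed=0):
    """Multiplicative updates for the assumed objective. Returns (U, V)."""
    rng = np.random.default_rng(seed)
    S, Dv = hypergraph_parts(X, k)
    A = label_constraint(labels)
    U = rng.random((X.shape[0], rank))
    Z = rng.random((A.shape[1], rank))
    eps = 1e-10                                       # avoids division by zero
    for _ in range(iters):
        V = A @ Z
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        num = A.T @ (X.T @ U + lam * (S @ V))
        den = A.T @ (V @ (U.T @ U) + lam * (Dv @ V)) + eps
        Z *= num / den
    return U, A @ Z
```

In this sketch, the columns of `U` would play the role of basis vectors over genes and the rows of `V` the low-dimensional sample representations. Because `V = A @ Z`, labeled samples of the same class are forced onto an identical representation, which is how the label information enters the factorization under these assumptions.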
ISSN: 2168-2194
2168-2208
DOI: 10.1109/JBHI.2020.2975199