Structured Sparse Non-Negative Matrix Factorization With [Formula Omitted]-Norm
Published in | IEEE transactions on knowledge and data engineering Vol. 35; no. 8; p. 8584 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023 |
Subjects | |
Summary: | Non-negative matrix factorization (NMF) is a powerful tool for dimensionality reduction and clustering. However, the clustering result from NMF is difficult to interpret, especially for high-dimensional biological data without effective feature selection. To address this problem, we introduce a row-sparse NMF with [Formula Omitted]-norm constraint (NMF_[Formula Omitted]), where the basis matrix [Formula Omitted] is constrained by the [Formula Omitted]-norm so that [Formula Omitted] has a row-sparsity pattern that performs feature selection. However, the model is challenging to solve because the [Formula Omitted]-norm constraint is non-convex and non-smooth. Fortunately, we prove that the [Formula Omitted]-norm constraint satisfies the Kurdyka-Łojasiewicz property. Based on this finding, we present a proximal alternating linearized minimization algorithm and its monotone accelerated version to solve the NMF_[Formula Omitted] model. In addition, we present an orthogonal NMF with [Formula Omitted]-norm constraint (ONMF_[Formula Omitted]) to enhance the clustering performance by using a non-negative orthogonal constraint. The ONMF_[Formula Omitted] model is solved by transforming it into a series of constrained and penalized matrix factorization problems. Convergence guarantees for the proposed algorithms are proved and their computational complexity is analyzed. The results on numerical and scRNA-seq datasets demonstrate the efficiency of our methods in comparison with existing methods. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2022.3206881 |
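
The summary describes a basis matrix constrained to a row-sparsity pattern, solved with proximal alternating linearized minimization. This record omits the formulas, so the following is only a minimal sketch under a stated assumption: that the constraint limits the number of non-zero rows to at most k. Under that assumption, the proximal step for the basis matrix reduces to a projection that clips negative entries and keeps the k rows with the largest Euclidean norm. The function name, the parameter `k`, and the toy matrix are hypothetical and are not taken from the paper.

```python
import math


def project_row_sparse_nonneg(W, k):
    """Project W (a list of rows) onto {W >= 0 with at most k non-zero rows}.

    Assumption: the omitted norm constraint bounds the number of non-zero rows.
    """
    # Clip negatives first: the closest non-negative row to w is max(w, 0).
    clipped = [[max(x, 0.0) for x in row] for row in W]
    # The norm of each clipped row measures what is lost by zeroing that row.
    norms = [math.sqrt(sum(x * x for x in row)) for row in clipped]
    # Keep the k rows with the largest norms (ties go to the lower index).
    keep = set(sorted(range(len(W)), key=lambda i: -norms[i])[:k])
    return [row if i in keep else [0.0] * len(row)
            for i, row in enumerate(clipped)]


# Toy 4-feature, 2-component basis matrix (made-up numbers for illustration).
W = [[0.9, 0.1], [0.05, -0.2], [0.4, 0.7], [-0.3, 0.02]]
P = project_row_sparse_nonneg(W, 2)

# Rows that survive the projection correspond to the selected features.
selected = [i for i, row in enumerate(P) if any(x > 0 for x in row)]
print(selected)
```

In an alternating scheme of the kind the summary mentions, a projection like this would follow a gradient step on the basis matrix, and the indices of the surviving rows would give the selected features.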