A Novel Biclustering Algorithm Based on Differential Sparsity Constraints and Dynamic Graph Regularization for Cancer Gene Expression Data

Bibliographic Details
Published in	IEEE Access, Vol. 13, pp. 94681-94695
Main Authors	Li, Dan; Song, Peicong; Wang, Jia
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
Subjects	Adaptation models; Algorithms; biclustering; Biological system modeling; Cancer; Cancer gene expression data; Constraints; Data models; differential sparsity constraints; dynamic graph regularization; Gene expression; Hands; Heuristic algorithms; Iterative solution; Matrix decomposition; Modules; Regularization; Singular value decomposition; Sparse matrices; Sparsity; Vectors

More Information
Summary:	Biclustering is an effective method for identifying biologically significant gene modules; it aims to extract information-rich gene modules and to support accurate cancer subtype classification. However, most biclustering algorithms based on sparse singular value decomposition overlook the differences in sparsity between the gene and sample dimensions of real cancer gene expression data. Furthermore, biclustering algorithms that incorporate graph-regularized penalties typically fail to update the graph information as biclusters are extracted layer by layer. To address these issues, this paper proposes a novel biclustering algorithm based on differential sparsity constraints and dynamic graph regularization (BCDD). First, because cancer gene expression data contain numerous redundant genes unrelated to the disease, whereas every sample either belongs to a specific cancer subtype or comes from a healthy subject, the proposed algorithm imposes $l_{1/2}$-norm and $l_{1}$-norm constraints on the gene and sample dimensions, respectively, to better capture the difference in sparsity between the two dimensions. Second, so that the graph adjacency matrix is updated in step with the expression data during the iterative solution process, a dynamic graph updating strategy based on the change in singular value is designed. This strategy keeps graph information associated with previously identified biclusters from interfering with subsequent analysis. Experimental results on multiple cancer gene expression datasets demonstrate that the proposed algorithm outperforms other state-of-the-art algorithms in biclustering performance.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2025.3570818
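
Illustrative note: the differential sparsity idea in the summary (a stronger l_{1/2} penalty on genes, an l_1 penalty on samples) can be sketched with two scalar thresholding operators inside one alternating rank-1 update. This is not the BCDD algorithm from the paper: the record gives no update rules, so the sketch below borrows the standard soft-thresholding operator for l_1 and the half-thresholding operator commonly used for l_{1/2} regularization, and it omits the graph regularization and dynamic graph update entirely. The function names, the toy matrix, and the penalty weights are all hypothetical.

```python
import math

def soft_threshold(y, lam):
    """l1 case: minimizer of 0.5*(x - y)**2 + lam*|x| (shrinks y toward zero by lam)."""
    return math.copysign(max(abs(y) - lam, 0.0), y)

def half_threshold(y, lam):
    """l_{1/2} case: half-thresholding operator, the minimizer of
    (x - y)**2 + lam*|x|**0.5. Note the different scaling convention
    from soft_threshold; the weights are not directly comparable."""
    cutoff = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(y) <= cutoff:
        return 0.0  # small entries are set exactly to zero
    phi = math.acos((lam / 8.0) * (abs(y) / 3.0) ** -1.5)
    return (2.0 / 3.0) * y * (1.0 + math.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))

def normalize(vec):
    """Scale a vector to unit Euclidean length; an all-zero vector is returned as-is."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

def rank1_step(X, v, lam_gene, lam_sample):
    """One alternating update of a sparse rank-1 factor pair (u, v).
    Rows of X are genes, columns are samples (an assumption of this sketch)."""
    n_genes, n_samples = len(X), len(X[0])
    # Gene vector: project onto v, then apply the l_{1/2} thresholding.
    u = normalize([
        half_threshold(sum(X[i][j] * v[j] for j in range(n_samples)), lam_gene)
        for i in range(n_genes)
    ])
    # Sample vector: project onto u, then apply the l1 thresholding.
    v_new = normalize([
        soft_threshold(sum(X[i][j] * u[i] for i in range(n_genes)), lam_sample)
        for j in range(n_samples)
    ])
    return u, v_new

# Toy data: genes 0 and 1 are co-expressed in samples 0 and 1; the rest is noise.
X = [
    [5.0, 5.0, 0.0],
    [4.0, 4.0, 0.0],
    [0.1, 0.1, 0.1],
    [0.0, 0.1, 0.0],
]
v0 = normalize([1.0, 1.0, 1.0])
u, v = rank1_step(X, v0, lam_gene=1.0, lam_sample=0.5)
print("gene loadings:", u)
print("sample loadings:", v)
```

In this toy run the nonzero entries of `u` and `v` mark the genes and samples assigned to the bicluster. A full method would iterate the update to convergence, deflate the data matrix, and repeat to extract further biclusters layer by layer.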