SCMcluster: a high-precision cell clustering algorithm integrating marker gene set with single-cell RNA sequencing data
Published in | Briefings in functional genomics Vol. 22; no. 4; pp. 329 - 340 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 17.07.2023 |
ISSN | 2041-2649, 2041-2657 |
DOI | 10.1093/bfgp/elad004 |
Summary: | Abstract
Single-cell clustering is the most significant part of single-cell RNA sequencing (scRNA-seq) data analysis. A main issue with scRNA-seq data is noise and sparsity, which pose a great challenge for the development of high-precision clustering algorithms. This study adopts cellular markers to identify differences between cells, which contributes to feature extraction of single cells. In this work, we propose a high-precision single-cell clustering algorithm, SCMcluster (single-cell cluster using marker genes). This algorithm integrates two cell marker databases (CellMarker and PanglaoDB) with scRNA-seq data for feature extraction and constructs an ensemble clustering model based on the consensus matrix. We test the efficiency of this algorithm and compare it with eight other popular clustering algorithms on two scRNA-seq datasets derived from human and mouse tissues, respectively. The experimental results show that SCMcluster outperforms the existing methods in both feature extraction and clustering performance. The source code of SCMcluster is freely available at https://github.com/HaoWuLab-Bioinformatics/SCMcluster. |
---|---|