SCMcluster: a high-precision cell clustering algorithm integrating marker gene set with single-cell RNA sequencing data


Bibliographic Details
Published in: Briefings in functional genomics, Vol. 22, no. 4, pp. 329-340
Main Authors: Wu, Hao; Zhou, Haoru; Zhou, Bing; Wang, Meili
Format: Journal Article
Language: English
Published: England, Oxford University Press, 17.07.2023
ISSN: 2041-2649, 2041-2657
DOI: 10.1093/bfgp/elad004

Summary: Single-cell clustering is the most significant part of single-cell RNA sequencing (scRNA-seq) data analysis. A main issue with scRNA-seq data is noise and sparsity, which pose a great challenge for the development of high-precision clustering algorithms. This study uses cellular markers to identify differences between cells, which aids feature extraction for single cells. In this work, we propose a high-precision single-cell clustering algorithm, SCMcluster (single-cell cluster using marker genes). The algorithm integrates two cell marker databases (the CellMarker database and the PanglaoDB database) with scRNA-seq data for feature extraction and constructs an ensemble clustering model based on the consensus matrix. We test the efficiency of this algorithm and compare it with eight other popular clustering algorithms on two scRNA-seq datasets derived from human and mouse tissues, respectively. The experimental results show that SCMcluster outperforms the existing methods in both feature extraction and clustering performance. The source code of SCMcluster is freely available at https://github.com/HaoWuLab-Bioinformatics/SCMcluster.
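
The summary names two ideas that a small example may make more concrete: restricting the expression matrix to marker genes, and combining several base clusterings through a consensus matrix. The Python sketch below illustrates both on toy data. It is not the authors' implementation (see the linked repository for that). All function names, the tiny expression matrix, the marker set, and the hand-written base clusterings are hypothetical, and the final grouping step (connected components above a threshold) is an assumed stand-in for whatever SCMcluster actually applies to its consensus matrix.

```python
def select_marker_features(expression, gene_names, marker_genes):
    """Keep only the expression columns whose gene is in the marker set."""
    keep = [i for i, gene in enumerate(gene_names) if gene in marker_genes]
    kept_names = [gene_names[i] for i in keep]
    reduced = [[row[i] for i in keep] for row in expression]
    return reduced, kept_names


def consensus_matrix(labelings):
    """Fraction of base clusterings in which each pair of cells co-clusters."""
    n_cells = len(labelings[0])
    counts = [[0] * n_cells for _ in range(n_cells)]
    for labels in labelings:
        for i in range(n_cells):
            for j in range(n_cells):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Assumed final step: connected components of pairs above the threshold."""
    n_cells = len(matrix)
    labels = [-1] * n_cells
    current = 0
    for start in range(n_cells):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n_cells):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data (hypothetical): 4 cells x 4 genes, two of which are markers.
gene_names = ["CD3E", "ACTB", "MS4A1", "GAPDH"]
marker_genes = {"CD3E", "MS4A1"}
expression = [
    [5, 9, 0, 8],
    [4, 9, 0, 7],
    [0, 8, 6, 9],
    [0, 9, 7, 8],
]
reduced, kept_names = select_marker_features(expression, gene_names, marker_genes)

# Three hand-written base clusterings standing in for different algorithms.
base_labelings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
matrix = consensus_matrix(base_labelings)
final_labels = consensus_clusters(matrix, threshold=0.5)
print(kept_names, final_labels)
```

Label names differ between the base clusterings (the second one swaps 0 and 1), but the consensus matrix only records whether two cells share a cluster, so such relabelings do not affect the result.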