Evaluation of zero counts to better understand the discrepancies between bulk and single-cell RNA-Seq platforms

Bibliographic Details
Published in: Computational and Structural Biotechnology Journal, Vol. 21, pp. 4663-4674
Main Authors: Zyla, Joanna; Papiez, Anna; Zhao, Jun; Qu, Rihao; Li, Xiaotong; Kluger, Yuval; Polanska, Joanna; Hatzis, Christos; Pusztai, Lajos; Marczyk, Michal
Format: Journal Article
Language: English
Published: Netherlands, Elsevier B.V., 01.01.2023
Research Network of Computational and Structural Biotechnology
Elsevier
Summary: Recent advances in sample preparation and sequencing technology have made it possible to profile the transcriptomes of individual cells using single-cell RNA sequencing (scRNA-Seq). Compared to bulk RNA-Seq data, single-cell data often contain a higher percentage of zero reads, mainly due to lower sequencing depth per cell, which affects mostly measurements of low-expression genes. However, discrepancies between platforms are observed regardless of expression level. Using four paired datasets with multiple samples each, we investigated technical and biological factors that can contribute to this expression shift. Using two separate machine learning models, we found that, in addition to expression level, RNA integrity, gene or UTR3 length, and the number of transcripts potentially also influence the occurrence of zeros. These findings could enable the development of novel analytical methods for cross-platform expression shift correction. We also identified genes and biological pathways in our diverse datasets that consistently showed differences when assessed at the single cell versus bulk level to assist in interpreting analysis across transcriptomic platforms. At the gene level, 25 genes (0.12%) were found in all datasets as discordant, but at the pathway level, 7 pathways (2.02%) showed shared enrichment in discordant genes.
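Illustrative note: the summary compares the share of zero counts per gene between bulk and single-cell measurements. The sketch below shows one simple way such a comparison might be computed. It is not the authors' method; the function names (`zero_fraction_per_gene`, `zero_fraction_shift`), the gene labels, and all counts are hypothetical and made up for illustration only.

```python
from typing import Dict, List


def zero_fraction_per_gene(counts: Dict[str, List[int]]) -> Dict[str, float]:
    """Return, for each gene, the fraction of samples (or cells) with a zero count."""
    fractions = {}
    for gene, values in counts.items():
        if not values:
            raise ValueError(f"gene {gene!r} has no measurements")
        fractions[gene] = sum(1 for v in values if v == 0) / len(values)
    return fractions


def zero_fraction_shift(
    bulk: Dict[str, List[int]], single_cell: Dict[str, List[int]]
) -> Dict[str, float]:
    """Single-cell minus bulk zero fraction, for genes measured on both platforms."""
    bulk_frac = zero_fraction_per_gene(bulk)
    sc_frac = zero_fraction_per_gene(single_cell)
    shared = bulk_frac.keys() & sc_frac.keys()
    return {gene: sc_frac[gene] - bulk_frac[gene] for gene in sorted(shared)}


# Toy counts (assumed, not taken from the article): one highly expressed
# gene and one lowly expressed gene, each measured on both platforms.
bulk_counts = {"GENE_A": [120, 98, 143, 110], "GENE_B": [3, 0, 5, 2]}
sc_counts = {
    "GENE_A": [4, 0, 7, 2, 5, 3, 0, 6],
    "GENE_B": [0, 0, 1, 0, 0, 0, 0, 0],
}

shift = zero_fraction_shift(bulk_counts, sc_counts)
print(shift)
```

In this toy input both genes show more zeros in the single-cell data, and the increase is larger for the lowly expressed gene. A real analysis would also have to account for sequencing depth, normalization, and the matching of samples across platforms, none of which this sketch attempts.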
Present Address: Department of Data Science and Engineering, Silesian University of Technology, Akademicka 16, Gliwice, 44–100, Poland.
ISSN: 2001-0370
DOI: 10.1016/j.csbj.2023.09.035