Estimating allele-specific expression of SNVs from 10x Genomics Single-Cell RNA-Sequencing Data

Bibliographic Details
Published in: bioRxiv
Main Authors: Nm, Prashant; Liu, Hongyu; Bousounis, Pavlos; Spurr, Liam; Alomran, Nawaf; Ibeawuchi, Helen; Sein, Justin; Reece-Stremtan, Dacian; Horvath, Anelia
Format: Paper
Language: English
Published: Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 23.12.2019
Summary: With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10x Genomics platform. We include in the analysis 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), with an average of more than 120K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). The high-quality SNV calls assessed in our study comprised approximately 15% exonic and >50% intronic loci. To analyze allele expression, we estimate the expressed Variant Allele Fraction (VAFRNA) from SNV-aware alignments and analyze its variance and distribution (mono- and bi-allelic) at different cutoffs for the minimum required number of sequencing reads. Our analysis shows that when assessing SNV loci covered by a minimum of 3 unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a minimum of 10 reads, nearly 90% of the SNVs are bi-allelic. Consistent with single-cell studies on RNA velocity and models of transcriptional burst kinetics, we observe a substantially higher rate of monoallelic expression among intronic SNVs, signifying the usefulness of scVAFRNA for assessing dynamic cellular processes. Our analysis demonstrates the feasibility of scVAFRNA estimation from current scRNA-seq datasets and shows that the 3-prime-based library generation protocol of 10x Genomics scRNA-seq data can be highly informative in SNV-based analyses.
DOI: 10.1101/2019.12.22.886119
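
The summary describes VAFRNA as the fraction of reads at an SNV locus that carry the variant allele, assessed only at loci meeting a minimum read cutoff. The following is a minimal sketch of that arithmetic, not the authors' pipeline. The function names, the input format (per-locus pairs of variant and reference read counts), and the rule that a locus counts as bi-allelic when both alleles have at least one read are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple


def vaf_rna(n_var: int, n_ref: int, min_reads: int = 3) -> Optional[float]:
    """Variant allele fraction n_var / (n_var + n_ref).

    Returns None when the locus has fewer than min_reads total reads
    (hypothetical cutoff parameter mirroring the 3- and 10-read thresholds).
    """
    total = n_var + n_ref
    if total < min_reads:
        return None
    return n_var / total


def allelic_status(vaf: Optional[float]) -> str:
    """Assumed rule: exactly 0 or 1 is mono-allelic, anything between is bi-allelic."""
    if vaf is None:
        return "not assessed"
    if vaf == 0.0 or vaf == 1.0:
        return "mono-allelic"
    return "bi-allelic"


def bi_allelic_fraction(
    counts: Iterable[Tuple[int, int]], min_reads: int = 3
) -> Optional[float]:
    """Share of assessed loci that are bi-allelic; None if no locus passes the cutoff."""
    statuses = [allelic_status(vaf_rna(v, r, min_reads)) for v, r in counts]
    assessed = [s for s in statuses if s != "not assessed"]
    if not assessed:
        return None
    return sum(s == "bi-allelic" for s in assessed) / len(assessed)


# Made-up (variant, reference) read counts for four loci in one cell.
example_counts = [(0, 3), (2, 2), (5, 5), (1, 1)]
print(bi_allelic_fraction(example_counts, min_reads=3))
print(bi_allelic_fraction(example_counts, min_reads=10))
```

Raising `min_reads` discards low-coverage loci, where observing only one allele by chance is more likely; this is one plausible reading of why the reported bi-allelic share rises from the 3-read to the 10-read cutoff.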