rMATS: Robust and flexible detection of differential alternative splicing from replicate RNA-Seq data

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences (PNAS), Vol. 111, no. 51, pp. E5593-E5601
Main Authors: Shen, Shihao; Park, Juw Won; Lu, Zhi-xiang; Lin, Lan; Henry, Michael D.; Wu, Ying Nian; Zhou, Qing; Xing, Yi
Format: Journal Article
Language: English
Published: United States, National Academy of Sciences, 23 December 2014
Series: PNAS Plus
Summary:
Significance: Alternative splicing (AS) is an important mechanism of eukaryotic gene regulation. Deep RNA sequencing (RNA-Seq) has become a powerful approach for quantitative profiling of AS. With the increasing capacity of high-throughput sequencers, it has become common for RNA-Seq studies of AS to examine multiple biological replicates. We developed rMATS, a new statistical method for robust and flexible detection of differential AS from replicate RNA-Seq data. Besides the analysis of unpaired replicates, rMATS includes a model specifically designed for paired replicates, such as case–control matched pairs in clinical RNA-Seq datasets. We expect rMATS will be useful for genome-wide studies of AS in diverse research projects. Our data also provide new insights about the experimental design for RNA-Seq studies of AS.

Ultra-deep RNA sequencing (RNA-Seq) has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We previously developed multivariate analysis of transcript splicing (MATS), a statistical method for detecting differential alternative splicing between two RNA-Seq samples. Here we describe a new statistical model and computer program, replicate MATS (rMATS), designed for detection of differential alternative splicing from replicate RNA-Seq data. rMATS uses a hierarchical model to simultaneously account for sampling uncertainty in individual replicates and variability among replicates. In addition to the analysis of unpaired replicates, rMATS also includes a model specifically designed for paired replicates between sample groups. The hypothesis-testing framework of rMATS is flexible and can assess the statistical significance over any user-defined magnitude of splicing change. The performance of rMATS is evaluated by the analysis of simulated and real RNA-Seq data. rMATS outperformed two existing methods for replicate RNA-Seq data in all simulation settings, and RT-PCR yielded a high validation rate (94%) in an RNA-Seq dataset of prostate cancer cell lines. Our data also provide guiding principles for designing RNA-Seq studies of alternative splicing. We demonstrate that it is essential to incorporate biological replicates in the study design. Of note, pooling RNAs or merging RNA-Seq data from multiple replicates is not an effective approach to account for variability, and the result is particularly sensitive to outliers. The rMATS source code is freely available at rnaseq-mats.sourceforge.net/. As the popularity of RNA-Seq continues to grow, we expect rMATS will be useful for studies of alternative splicing in diverse RNA-Seq projects.
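The point about pooling can be illustrated with a small sketch. It is not the rMATS model: it assumes a naive exon inclusion level computed as inclusion reads divided by total reads, with no length normalization, and the read counts are invented purely for illustration. The function and variable names are hypothetical.

```python
from statistics import median


def inclusion_level(inclusion_reads: int, skipping_reads: int) -> float:
    """Naive inclusion ratio; an illustrative stand-in, not the rMATS estimator."""
    total = inclusion_reads + skipping_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


# Hypothetical (inclusion, skipping) read counts for four replicates.
# The last replicate is an outlier that is also sequenced much more deeply.
replicates = [(30, 70), (28, 72), (33, 67), (900, 100)]

# Estimate per replicate, then summarize across replicates.
per_replicate = [inclusion_level(inc, skip) for inc, skip in replicates]
robust_summary = median(per_replicate)

# Merge all reads first, then estimate once.
pooled = inclusion_level(
    sum(inc for inc, _ in replicates),
    sum(skip for _, skip in replicates),
)

print("per replicate:", [round(p, 2) for p in per_replicate])
print("median of replicates:", round(robust_summary, 3))
print("pooled estimate:", round(pooled, 3))
```

In this toy setting the pooled estimate is pulled toward the single deep outlier, while the per-replicate values still show that three of four replicates agree. Keeping replicates separate is what allows variability among them to be modeled at all.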
Bibliography: http://dx.doi.org/10.1073/pnas.1419161111
Edited by Wing Hung Wong, Stanford University, Stanford, CA, and approved November 3, 2014 (received for review October 7, 2014)
Author contributions: S.S., Y.N.W., Q.Z., and Y.X. designed research; S.S., J.W.P., Z.-x.L., and L.L. performed research; S.S., J.W.P., L.L., and M.D.H. contributed new reagents/analytic tools; S.S., Z.-x.L., and Y.X. analyzed data; and S.S. and Y.X. wrote the paper.
S.S. and J.W.P. contributed equally to this work.
ISSN: 0027-8424, 1091-6490
DOI: 10.1073/pnas.1419161111