Estimating the strength of expression conservation from high throughput RNA-seq data

Bibliographic Details
Published in Bioinformatics Vol. 35; no. 23; pp. 5030-5038
Main Authors Gu, Xun; Ruan, Hang; Yang, Jingwen
Format Journal Article
Language English
Published England Oxford University Press 01.12.2019
Summary: Motivation: The evolution of gene expression across species is usually subject to stabilizing selection, which maintains an optimal expression level. While it is generally accepted that the resulting expression conservation may vary considerably among genes, statistically reliable estimation remains challenging because current comparative RNA-seq datasets include few species relative to the large number of unknown parameters. Results: In this paper, we develop a gamma distribution model to describe how the strength of expression conservation (denoted by W) varies among genes. Given high-throughput RNA-seq datasets from multiple species, we then formulate an empirical Bayesian procedure to estimate W for each gene. Our case studies show that these W estimates are useful for studying the evolutionary pattern of expression conservation. Availability and implementation: Our method has been implemented in the R package TreeExp, which is publicly available on GitHub at https://github.com/hr1912/TreeExp. It involves three functions: estParaGamma, estParaQ and estParaWBayesian. The manual for TreeExp is available at https://github.com/hr1912/TreeExp/tree/master/vignettes. For any questions, contact Dr Hang Ruan (Hang.Ruan@uth.tmc.edu).
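The empirical Bayesian idea in the summary can be illustrated with a generic sketch. This is not the TreeExp implementation and does not reproduce estParaGamma, estParaQ or estParaWBayesian. It assumes, purely for illustration, that a gene's log expression in n species is normally distributed with variance 1/W, that species are independent (phylogeny is ignored), and that W follows a gamma prior with shape a and rate b. The function names and the method-of-moments step below are hypothetical choices made for this sketch.

```python
from statistics import fmean


def sum_sq_dev(x):
    """Sum of squared deviations of x from its mean."""
    xbar = fmean(x)
    return sum((xi - xbar) ** 2 for xi in x)


def fit_gamma_prior(variances, n_species):
    """Method-of-moments estimate of the gamma prior (shape a, rate b) for W.

    Assumption of this sketch: each entry of `variances` is an unbiased
    sample variance from `n_species` values with true variance 1/W, so
    E[V] = b/(a-1) and E[V^2] = c * b^2/((a-1)(a-2)), c = (k+2)/k, k = n-1.
    """
    k = n_species - 1
    c = (k + 2) / k
    m1 = fmean(variances)
    m2 = fmean([v * v for v in variances])
    r = m2 / (c * m1 * m1)  # equals (a-1)/(a-2) under the model
    if r <= 1.0:
        raise ValueError("too little among-gene variation to fit the prior")
    a = (2.0 * r - 1.0) / (r - 1.0)
    b = m1 * (a - 1.0)
    return a, b


def posterior_mean_w(x, a, b):
    """Posterior mean of W for one gene under the normal-gamma model.

    With a flat prior on the gene's mean expression, the posterior of W is
    Gamma(a + (n-1)/2, b + S/2), where S is the sum of squared deviations.
    """
    n = len(x)
    shape = a + (n - 1) / 2.0
    rate = b + sum_sq_dev(x) / 2.0
    return shape / rate
```

In this sketch, a gene measured in only a few species contributes little to the posterior shape and rate, so its estimate of W stays close to the prior mean a/b. This shrinkage is the general reason an empirical Bayesian approach helps when the number of species is small.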
ISSN: 1367-4803, 1367-4811, 1460-2059
DOI: 10.1093/bioinformatics/btz405