Application of the Gini Correlation Coefficient to Infer Regulatory Relationships in Transcriptome Analysis

Bibliographic Details
Published in Plant physiology (Bethesda) Vol. 160; no. 1; pp. 192 - 203
Main Authors Ma, Chuang; Wang, Xiangfeng
Format Journal Article
Language English
Published Rockville, MD American Society of Plant Biologists 01.09.2012
Summary: One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey's biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses.
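The summary names the Gini correlation coefficient without defining it. As a minimal illustrative sketch, the Python below uses the commonly cited rank-based definition, cov(X, rank(Y)) / cov(X, rank(X)). It is an assumption that this matches what the article and the rsgcc package compute; the function names `gini_correlation` and `_ranks`, the tie-breaking rule, and the toy values are hypothetical and not taken from the source.

```python
import numpy as np


def _ranks(values):
    """Return 1-based ranks. Ties are broken by order of appearance,
    which is an assumption made for this sketch."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


def gini_correlation(x, y):
    """Gini correlation of x with respect to y (asymmetric).

    Uses the values of x together with the ranks of y:
        cov(x, rank(y)) / cov(x, rank(x))
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1-D arrays of equal length >= 2")

    cov_x_rank_y = np.cov(x, _ranks(y))[0, 1]
    cov_x_rank_x = np.cov(x, _ranks(x))[0, 1]
    if cov_x_rank_x == 0:
        return float("nan")  # x is constant, so the ratio is undefined
    return cov_x_rank_y / cov_x_rank_x


# Toy expression profiles (made-up numbers, not data from the article).
regulator = [1.0, 2.0, 3.0, 10.0]
target = [1.0, 3.0, 2.0, 4.0]

forward = gini_correlation(regulator, target)
backward = gini_correlation(target, regulator)
```

Because only the ranks of the second variable enter the numerator, the measure is not symmetric: `forward` and `backward` generally differ, so both directions are usually worth computing for a gene pair.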
Bibliography: The online version of this article contains Web-only data.
Open Access articles can be viewed online without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.112.201962
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Xiangfeng Wang (xwang1@cals.arizona.edu).
ISSN: 0032-0889
1532-2548
DOI: 10.1104/pp.112.201962