Identifying Graphical Models


Bibliographic Details
Published in arXiv.org
Main Authors Shevlyakova, Maya; Morgenthaler, Stephan
Format Paper
Language English
Published Ithaca: Cornell University Library, arXiv.org, 23.09.2013
Summary: The ability to identify reliably a positive or negative partial correlation between the expression levels of two genes is influenced by the number \(p\) of genes, the number \(n\) of analyzed samples, and the statistical properties of the measurements. Classical statistical theory teaches that the product of the root sample size multiplied by the size of the partial correlation is the crucial quantity. But this has to be combined with some adjustment for multiplicity depending on \(p\), which makes the classical analysis somewhat arbitrary. We investigate this problem through the lens of the Kullback-Leibler divergence, which is a measure of the average information for detecting an effect. We conclude that commonly sized studies in genetical epidemiology are not able to reliably detect moderately strong links.
ISSN: 2331-8422
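
The classical calculation that the summary refers to can be illustrated with a minimal sketch. It is not the authors' Kullback-Leibler analysis. It assumes the standard Fisher z approximation for a sample partial correlation (variance roughly 1/(n - p - 1) when conditioning on the other p - 2 genes) and a Bonferroni correction over all p(p - 1)/2 gene pairs as the multiplicity adjustment. The function name `detection_power` and the example values of n, p, and the partial correlation are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def detection_power(rho, n, p, alpha=0.05):
    """Approximate power to detect a partial correlation rho (|rho| < 1).

    Assumptions (not taken from the paper): Fisher z approximation and a
    two-sided Bonferroni-corrected test over all p * (p - 1) / 2 gene pairs.
    """
    # Conditioning on the remaining p - 2 genes leaves n - (p - 2) - 3 = n - p - 1.
    dof = n - p - 1
    if dof <= 0:
        raise ValueError("need n > p + 1 for the Fisher z approximation")

    n_tests = p * (p - 1) // 2          # one test per pair of genes
    std_normal = NormalDist()
    # Two-sided critical value after the Bonferroni adjustment.
    z_crit = std_normal.inv_cdf(1 - alpha / (2 * n_tests))

    # "Root sample size times effect size", on the Fisher z scale.
    signal = abs(atanh(rho)) * sqrt(dof)

    # Probability that the test statistic falls beyond either critical value.
    return std_normal.cdf(signal - z_crit) + std_normal.cdf(-signal - z_crit)


if __name__ == "__main__":
    # Hypothetical study sizes with p = 100 genes and a partial correlation of 0.2.
    for n in (150, 500, 2000):
        print(n, round(detection_power(rho=0.2, n=n, p=100), 3))
```

Under these assumptions the power depends on the sample size only through the product of the root of the effective sample size and the transformed partial correlation, while p enters twice: it shrinks the effective sample size and raises the critical value. The choice of Bonferroni here is one of several possible adjustments, which is the arbitrariness the summary points out.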