Application of a mixture model for determining the cutoff threshold for activity in high-throughput screening
Published in | Computational statistics & data analysis Vol. 51; no. 8; pp. 4002 - 4012 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Amsterdam Elsevier B.V 01.05.2007 Elsevier Science Elsevier |
Series | Computational Statistics & Data Analysis |
Subjects | |
Summary: | Rapid throughput of assays for assessing the biological activity of compounds, known as high-throughput screening (HTS), has created a need for statistical analysis of the resulting data. Conventional methods for separating active from inactive compounds use only a portion of the screening data to determine the cutoff threshold value; they are ad hoc and unsatisfactory. Taking full advantage of the entire set of screening data, we assume that the responses can be sorted into two classes: measurements associated with inactive compounds and measurements associated with active compounds. Both theoretical and practical considerations lead us to model the measurements of inactive compounds with a Gaussian distribution, a choice that is consistent with the data and our analytical experience. In our examples, active compounds inhibit an enzyme, and the measurements from those compounds can be characterized by a gamma or a similar one-sided long-tailed distribution. This mixture of Gaussian and gamma distributions describes the activity in our screening data very well. We present the reasoning behind the model and its derivation, and describe how to set the cutoff threshold value optimally in a statistically well-defined manner. This modeling approach provides a new and useful statistical tool for HTS. |
---|---|
ISSN: | 0167-9473 1872-7352 |
DOI: | 10.1016/j.csda.2006.12.014 |
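
The summary describes a two-component mixture: a Gaussian for inactive compounds and a gamma for active ones. As a rough illustration only, the sketch below shows one way a cutoff could be read off such a mixture once its parameters are known. The decision rule used here (the point where the posterior probability of being active reaches 0.5) and all numeric parameter values are assumptions made for this example, not the article's criterion or fitted results, and the function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density, used here for measurements of inactive compounds."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gamma_pdf(x, shape, scale):
    """Gamma density (one-sided, long right tail), used here for actives."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def posterior_active(x, p_active, mu, sigma, shape, scale):
    """Posterior probability that a measurement x comes from the active class."""
    active = p_active * gamma_pdf(x, shape, scale)
    inactive = (1.0 - p_active) * normal_pdf(x, mu, sigma)
    total = active + inactive
    return active / total if total > 0.0 else 0.0

def find_cutoff(p_active, mu, sigma, shape, scale, lo, hi, level=0.5, tol=1e-9):
    """Bisection for the x in [lo, hi] where the posterior crosses `level`.

    Assumes the posterior is below `level` at lo and above it at hi.
    """
    def f(x):
        return posterior_active(x, p_active, mu, sigma, shape, scale) - level
    if f(lo) >= 0.0 or f(hi) <= 0.0:
        raise ValueError("posterior does not cross the level inside [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative (made-up) parameters on a percent-inhibition scale:
# 5% actives, inactives ~ N(0, 5^2), actives ~ Gamma(shape=3, scale=15).
params = dict(p_active=0.05, mu=0.0, sigma=5.0, shape=3.0, scale=15.0)
cutoff = find_cutoff(lo=0.0, hi=100.0, **params)
print(f"cutoff = {cutoff:.2f} percent inhibition")
```

With these made-up parameters the cutoff falls a little above three standard deviations of the inactive distribution. In practice the mixture parameters would first have to be estimated from the full set of screening data, for example by maximum likelihood, which this sketch does not cover.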