The exact joint distribution of the sum of heads and apparent size statistics of a "tandem repeats finder" algorithm
Published in | Bulletin of Mathematical Biology, Vol. 68, no. 8, pp. 2353-2364 |
---|---|
Format | Journal Article |
Language | English |
Published | United States: Springer Nature B.V, 01.11.2006 |
Summary: | Tandem repeats play many important roles in biological research. However, accurate characterization of their properties is limited by the inability to easily detect them. For this reason, much work has been devoted to developing detection algorithms. A widely used algorithm for detecting tandem repeats is the "tandem repeats finder" (Benson, G., Nucleic Acids Res. 27, 573-580, 1999). In that algorithm, tandem repeats are modeled by percent matches and frequency of indels between adjacent pattern copies, and statistical criteria are used to recognize them. We give a method for computing the exact joint distribution of a pair of statistics that are used in the testing procedures of the "tandem repeats finder": the total number of matches in matching tuples of length k or longer, and the total number of observations from the beginning of the first such matching tuple to the end of the last one. This allows the computation of the conditional distribution of the latter statistic given the former, a conditional distribution that is used to test for tandem repeats as opposed to non-tandem direct repeats. The setting is a Markovian sequence of a general order. Current approaches to this distributional problem deal only with independent trials and are based on approximations via simulation. |
---|---|
ISSN: | 0092-8240, 1522-9602 |
DOI: | 10.1007/s11538-006-9146-0 |
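
To make the two statistics in the summary concrete, here is a minimal Python sketch. It is not the article's method: it assumes independent trials with match probability `p` and enumerates all 2^n sequences by brute force, so it is only practical for small `n` and does not cover the Markovian setting the article treats. It also assumes that "matching tuples of length k or longer" means maximal runs of consecutive matches (1s) of length at least `k`. The function names `run_statistics`, `joint_distribution`, and `conditional_size_given_heads` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import product


def run_statistics(seq, k):
    """Return (sum_of_heads, apparent_size) for a 0/1 match sequence.

    sum_of_heads:  total number of 1s lying in maximal runs of length >= k
                   (assumed reading of "matching tuples of length k or longer").
    apparent_size: number of positions from the start of the first such run
                   to the end of the last one; 0 if no run qualifies.
    """
    total = 0
    first_start = None
    last_end = None
    i, n = 0, len(seq)
    while i < n:
        if seq[i] == 1:
            j = i
            while j < n and seq[j] == 1:
                j += 1  # j ends one past the last 1 of this run
            if j - i >= k:
                total += j - i
                if first_start is None:
                    first_start = i
                last_end = j
            i = j
        else:
            i += 1
    size = 0 if first_start is None else last_end - first_start
    return total, size


def joint_distribution(n, k, p):
    """Exact joint distribution by enumeration, assuming i.i.d. trials."""
    dist = defaultdict(float)
    for seq in product((0, 1), repeat=n):
        heads = sum(seq)
        prob = p ** heads * (1 - p) ** (n - heads)
        dist[run_statistics(seq, k)] += prob
    return dict(dist)


def conditional_size_given_heads(dist, heads):
    """Conditional distribution of apparent size given the sum of heads."""
    marginal = sum(pr for (h, _), pr in dist.items() if h == heads)
    if marginal == 0:
        raise ValueError("sum of heads has probability zero")
    return {s: pr / marginal for (h, s), pr in dist.items() if h == heads}
```

For example, `conditional_size_given_heads(joint_distribution(5, 2, 0.5), 4)` gives the distribution of the apparent size among length-5 sequences whose qualifying runs contain exactly four matches. Enumeration grows as 2^n, which is why an exact method that avoids listing every sequence is needed for realistic sequence lengths.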