A Coincidence-Based Test for Uniformity Given Very Sparsely Sampled Discrete Data
Published in | IEEE Transactions on Information Theory, Vol. 54, no. 10, pp. 4750-4755 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | New York, NY: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.10.2008 |
Subjects | |
Summary: | How many independent samples $N$ do we need from a distribution $p$ to decide that $p$ is $\epsilon$-distant from uniform in an $L_1$ sense, $\sum_{i=1}^{m} \lvert p(i) - 1/m \rvert > \epsilon$? (Here $m$ is the number of bins on which the distribution is supported, and is assumed known a priori.) Somewhat surprisingly, we only need $N\epsilon^2 \gg m^{1/2}$ to make this decision reliably (this condition is both sufficient and necessary). The test for uniformity introduced here is based on the number of observed "coincidences" (samples that fall into the same bin), the mean and variance of which may be computed explicitly for the uniform distribution and bounded nonparametrically for any distribution that is known to be $\epsilon$-distant from uniform. Some connections to the classical birthday problem are noted. |
---|---|
ISSN: | 0018-9448, 1557-9654 |
DOI: | 10.1109/TIT.2008.928987 |
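
The idea in the summary can be illustrated with a minimal Python sketch. It is not the article's procedure: the function names are hypothetical, a "coincidence" is assumed here to mean a sample that lands in an already occupied bin (so the count is $N$ minus the number of distinct bins observed), and the rejection threshold is estimated by Monte Carlo simulation under the uniform distribution rather than from the explicit mean and variance bounds the summary mentions. The sketch rejects uniformity when the count is unusually large, since the uniform distribution gives the smallest expected number of coincidences.

```python
import random


def count_coincidences(samples):
    """Samples that fall into an already-occupied bin: N minus distinct bins.

    Assumption for this sketch: this is one simple way to count coincidences.
    """
    return len(samples) - len(set(samples))


def expected_coincidences_uniform(n, m):
    """Exact mean of the count above for n uniform draws over m bins.

    The expected number of occupied bins is m * (1 - (1 - 1/m)**n).
    """
    return n - m * (1.0 - (1.0 - 1.0 / m) ** n)


def null_threshold(n, m, alpha=0.05, trials=2000, rng=None):
    """Monte Carlo estimate of the (1 - alpha) quantile under uniformity.

    Assumption: simulation stands in for an analytic threshold.
    """
    rng = rng or random.Random(0)
    counts = sorted(
        count_coincidences([rng.randrange(m) for _ in range(n)])
        for _ in range(trials)
    )
    return counts[min(trials - 1, int((1 - alpha) * trials))]


def reject_uniform(samples, m, alpha=0.05, trials=2000, rng=None):
    """Reject uniformity when there are more coincidences than the threshold."""
    threshold = null_threshold(len(samples), m, alpha, trials, rng)
    return count_coincidences(samples) > threshold


if __name__ == "__main__":
    m, n = 1000, 200
    rng = random.Random(1)
    # Hypothetical alternative: half the bins at 1.5/m, half at 0.5/m,
    # which is at L1 distance 0.5 from uniform.
    weights = [1.5] * (m // 2) + [0.5] * (m // 2)
    skewed = rng.choices(range(m), weights=weights, k=n)
    print("coincidences observed:", count_coincidences(skewed))
    print("mean under uniform:", expected_coincidences_uniform(n, m))
    print("reject uniform:", reject_uniform(skewed, m))
```

With a sparse sample ($N$ much smaller than $m$) most bins are empty, so per-bin frequency estimates are uninformative; the coincidence count is a single scalar that remains usable in that regime, which is why the sketch tests it rather than a histogram.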