Regression to the mean for the bivariate binomial distribution

Bibliographic Details
Published in Statistics in Medicine Vol. 38; no. 13; pp. 2391-2412
Main Authors Khan, Manzoor; Olivier, Jake
Format Journal Article
Language English
Published England Wiley Subscription Services, Inc 15.06.2019
Summary: Regression to the mean (RTM) occurs when subjects with relatively high or low measurements are remeasured and found to be closer to the population mean. This phenomenon can lead to inaccurate conclusions in a pre-post study design. Expressions are available for quantifying RTM when the distribution of pre and post observations is bivariate normal or bivariate Poisson. However, situations exist where the response variables are the number of successes in a fixed number of trials and follow the bivariate binomial distribution. In this article, expressions for quantifying RTM effects are derived when the underlying distribution is the bivariate binomial. Unlike under the normal and Poisson distributions, the correlation between pre and post observations under the bivariate binomial distribution can be either negative or positive, and RTM is more severe when the correlation is negative. The percentage relative difference is used to highlight the differences between quantifying RTM under the bivariate binomial distribution and under normal and Poisson approximations to it. Expressions for estimating RTM by the method of maximum likelihood are derived, along with the asymptotic distribution of the estimator. A simulation study is conducted to empirically assess the statistical properties of the RTM estimator and its asymptotic distribution. Data examples using the number of obese individuals and the number of nonconforming cardboard cans are discussed.
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.8115
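
The RTM effect described in the summary can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the article's derivation: it assumes the bivariate binomial pair is built as the sum of n independent correlated Bernoulli pairs (one common construction), that there is no treatment effect between the pre and post measurements, and that subjects are selected when the pre count exceeds a cutoff. The function names, the cell probabilities, and the cutoff are all hypothetical choices made for illustration.

```python
import random


def simulate_pair(n, p11, p10, p01, rng):
    """Draw (pre, post) counts as the sum of n correlated Bernoulli pairs.

    Assumed construction: in each trial the pair (b1, b2) equals
    (1, 1) with probability p11, (1, 0) with p10, (0, 1) with p01,
    and (0, 0) otherwise.
    """
    pre = post = 0
    for _ in range(n):
        u = rng.random()
        if u < p11:
            pre += 1
            post += 1
        elif u < p11 + p10:
            pre += 1
        elif u < p11 + p10 + p01:
            post += 1
    return pre, post


def empirical_rtm(n, p11, p10, p01, cutoff, n_subjects=20000, seed=0):
    """Mean of (pre - post) among subjects whose pre count exceeds cutoff.

    With identical pre and post marginals and no treatment effect,
    any nonzero mean difference here is due to RTM alone.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_subjects):
        pre, post = simulate_pair(n, p11, p10, p01, rng)
        if pre > cutoff:
            diffs.append(pre - post)
    if not diffs:
        raise ValueError("no subject exceeded the cutoff")
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Both settings have marginal success probability 0.3 per trial.
    # Positive correlation: p11 > 0.3 * 0.3.
    rtm_pos = empirical_rtm(20, 0.20, 0.10, 0.10, cutoff=8)
    # Negative correlation: p11 < 0.3 * 0.3.
    rtm_neg = empirical_rtm(20, 0.05, 0.25, 0.25, cutoff=8)
    print("empirical RTM, positive correlation:", rtm_pos)
    print("empirical RTM, negative correlation:", rtm_neg)
```

In this sketch the selected group's post counts fall back toward the population mean in both settings, and the drop is larger in the negatively correlated setting, in line with the summary's statement about severity. The article's closed-form expressions and maximum likelihood estimator are not reproduced here.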