What Is an Intracluster Correlation Coefficient? Crucial Concepts for Primary Care Researchers

Bibliographic Details
Published in: Annals of Family Medicine, Vol. 2, no. 3, pp. 204-208
Main Author: Killip, S.
Format: Journal Article
Language: English
Published: United States, Copyright 2004 Annals of Family Medicine, Inc, 01.05.2004
Summary: Primary care research often involves clustered samples in which subjects are randomized at a group level but analyzed at an individual level. Analyses that do not take this clustering into account may report significance where none exists. This article explores the causes, consequences, and implications of cluster data. Using a case study with accompanying equations, we show that clustered samples are not as statistically efficient as simple random samples. Similarity among subjects within preexisting groups or clusters reduces the variability of responses in a clustered sample, which erodes the power to detect true differences between study arms. This similarity is expressed by the intracluster correlation coefficient, or ρ (rho), which compares the within-group variance with the between-group variance. Rho is used in equations along with the cluster size and the number of clusters to calculate the effective sample size (ESS) in a clustered design. The ESS should be used to calculate power in the design phase of a clustered study. Appropriate accounting for similarities among subjects in a cluster almost always results in a net loss of power, requiring increased total subject recruitment. Increasing the number of clusters enhances power more efficiently than does increasing the number of subjects within a cluster. Primary care research frequently uses clustered designs, whether consciously or unconsciously. Researchers must recognize and understand the implications of clusters to avoid costly sample size errors.
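The relationship the summary describes between rho, cluster size, number of clusters, and ESS can be sketched in a few lines. This is a minimal illustration, assuming equal cluster sizes and the standard design-effect formula DE = 1 + rho(m - 1), with ESS = mk / DE; the function names and the example values (rho = 0.05, 200 total subjects) are hypothetical and are not taken from the article's case study.

```python
def design_effect(rho: float, cluster_size: float) -> float:
    """Variance inflation from clustering, assuming equal cluster sizes.

    rho          -- intracluster correlation coefficient (0 <= rho <= 1)
    cluster_size -- number of subjects per cluster (m)
    """
    return 1.0 + rho * (cluster_size - 1.0)


def effective_sample_size(rho: float, cluster_size: float, n_clusters: float) -> float:
    """Effective sample size: total subjects (m * k) divided by the design effect."""
    total_subjects = cluster_size * n_clusters
    return total_subjects / design_effect(rho, cluster_size)


if __name__ == "__main__":
    rho = 0.05  # hypothetical value, chosen only for illustration

    # Same 200 total subjects, arranged two different ways.
    few_large_clusters = effective_sample_size(rho, cluster_size=20, n_clusters=10)
    many_small_clusters = effective_sample_size(rho, cluster_size=10, n_clusters=20)

    print(f"10 clusters of 20: ESS = {few_large_clusters:.1f}")
    print(f"20 clusters of 10: ESS = {many_small_clusters:.1f}")
```

With these assumed inputs, both designs recruit 200 subjects, yet the design with more, smaller clusters yields the larger ESS, which is consistent with the summary's point that adding clusters enhances power more efficiently than adding subjects within a cluster. When rho is 0 the design effect is 1 and the ESS equals the total number of subjects.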
Bibliography:
Funding support: This work was supported by grant # 1 D14 HP 00041 from the Health Resources and Services Administration.
Conflict of interest: none reported
Previous presentations: This work was presented by Dr. Shersten Killip on December 6, 2002, at the Primary Care Research Methods and Statistics Conference in San Antonio, Tex, and on March 22, 2003, at the 2003 Convocation of Practices (hosted by the AAFP National Network for Family Practice and Primary Care Research and the Federation of Practice-Based Research Networks) in Arlington, Va. Both presentations were under the title “What Is an Intracluster Correlation Coefficient? Crucial Concepts for Novice PBRN Researchers.”
CORRESPONDING AUTHOR: Shersten Killip, MD, MPH, K-302 Kentucky Clinic 0284, 740 S. Limestone, Lexington, KY 40536-0284, skill2@email.uky.edu
ISSN: 1544-1709, 1544-1717
DOI:10.1370/afm.141