Reproducibility of R‐fMRI metrics on the impact of different strategies for multiple comparison correction and sample sizes
Published in: Human Brain Mapping, Vol. 39, No. 1, pp. 300–318
Main Authors: Chen, Xiao; Lu, Bin; Yan, Chao-Gan
Format: Journal Article
Language: English
Published: John Wiley & Sons, Inc., United States, 01.01.2018
Summary: Concerns regarding the reproducibility of resting‐state functional magnetic resonance imaging (R‐fMRI) findings have been raised. Little is known about how to operationally define R‐fMRI reproducibility and to what extent it is affected by multiple comparison correction strategies and sample size. We comprehensively assessed two aspects of reproducibility, test–retest reliability and replicability, for widely used R‐fMRI metrics in both between‐subject contrasts of sex differences and within‐subject comparisons of eyes‐open and eyes‐closed (EOEC) conditions. We found that the permutation test with Threshold‐Free Cluster Enhancement (TFCE), a strict multiple comparison correction strategy, achieved the best balance between the family‐wise error rate (under 5%) and test–retest reliability/replicability (e.g., for the amplitude of low‐frequency fluctuations (ALFF), test–retest reliability of 0.68 and replicability of 0.25 for between‐subject sex differences, and replicability of 0.49 for within‐subject EOEC differences). Although the R‐fMRI indices attained moderate reliability, they replicated poorly across distinct datasets (replicability < 0.3 for between‐subject sex differences, < 0.5 for within‐subject EOEC differences). By randomly drawing different sample sizes from a single site, we found that reliability, sensitivity, and positive predictive value (PPV) rose as sample size increased. Small sample sizes (e.g., < 80 subjects [40 per group]) not only reduced power (sensitivity < 2%) but also decreased the likelihood that significant results reflected true effects (PPV < 0.26) in the sex‐difference analyses. Our findings have implications for the selection of multiple comparison correction strategies and highlight the importance of sufficiently large sample sizes in R‐fMRI studies to enhance reproducibility. Hum Brain Mapp 39:300–318, 2018. © 2017 Wiley Periodicals, Inc.
ISSN: 1065-9471, 1097-0193
DOI: 10.1002/hbm.23843
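
As a back-of-the-envelope companion to the two quantities the abstract leans on, here is a minimal Python sketch; it is not from the paper, and the synthetic data shapes, the assumed prior of 0.3, and the use of a plain max-statistic threshold in place of full TFCE are all assumptions. It illustrates (1) how a permutation test controls the family-wise error rate via the null distribution of the image-wide maximum statistic, and (2) how PPV ties together power, the alpha level, and the prior probability of a true effect:

```python
import numpy as np

def two_sample_t(data, n_a):
    """Welch t statistic per voxel for groups data[:n_a] vs. data[n_a:]."""
    a, b = data[:n_a], data[n_a:]
    se = np.sqrt(a.var(axis=0, ddof=1) / a.shape[0] +
                 b.var(axis=0, ddof=1) / b.shape[0])
    return (a.mean(axis=0) - b.mean(axis=0)) / se

def permutation_fwe(group_a, group_b, n_perm=5000, alpha=0.05, seed=0):
    """Max-statistic permutation test with family-wise error control.

    Sketch of the FWE logic only; a TFCE pipeline would additionally
    transform each permuted statistic map before taking the maximum.
    """
    rng = np.random.default_rng(seed)
    data = np.vstack([group_a, group_b])   # (subjects, voxels)
    n_a = group_a.shape[0]
    observed = two_sample_t(data, n_a)

    # Null distribution of the image-wide maximum |t| under relabeling:
    # thresholding at its (1 - alpha) quantile keeps the probability of
    # ANY false-positive voxel at about alpha, i.e., FWE control.
    max_null = np.array([
        np.abs(two_sample_t(data[rng.permutation(len(data))], n_a)).max()
        for _ in range(n_perm)
    ])
    threshold = np.quantile(max_null, 1 - alpha)
    return observed, threshold, np.abs(observed) > threshold

def ppv(power, alpha, prior):
    """Positive predictive value: P(effect is real | result is significant)."""
    return power * prior / (power * prior + alpha * (1 - prior))

# Demo on null data: roughly alpha of such runs should flag any voxel.
rng = np.random.default_rng(1)
a = rng.normal(size=(20, 500))   # 20 subjects per group, 500 voxels, no effect
b = rng.normal(size=(20, 500))
_, thr, sig = permutation_fwe(a, b, n_perm=1000)
print(f"FWE threshold |t| = {thr:.2f}, significant voxels: {sig.sum()}")

# PPV at the abstract's small-sample power of 2%, with an ASSUMED 30%
# prior probability that a tested effect is real.
print(f"PPV = {ppv(power=0.02, alpha=0.05, prior=0.3):.2f}")   # ~0.15
```

With sensitivity near the 2% reported for small samples, a nominally significant result is more likely false than true under this assumed prior, which is the abstract's PPV point. For real analyses, TFCE-based permutation inference is available in tools such as FSL randomise and PALM.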