Reproducibility of R‐fMRI metrics on the impact of different strategies for multiple comparison correction and sample sizes

Bibliographic Details
Published in: Human Brain Mapping, Vol. 39, No. 1, pp. 300–318
Main Authors: Chen, Xiao; Lu, Bin; Yan, Chao‐Gan
Format: Journal Article
Language: English
Published: United States: John Wiley & Sons, Inc., 01.01.2018
Summary: Concerns regarding the reproducibility of resting‐state functional magnetic resonance imaging (R‐fMRI) findings have been raised. Little is known about how to operationally define R‐fMRI reproducibility and to what extent it is affected by multiple comparison correction strategies and sample size. We comprehensively assessed two aspects of reproducibility, test–retest reliability and replicability, for widely used R‐fMRI metrics in both between‐subject contrasts of sex differences and within‐subject comparisons of eyes‐open and eyes‐closed (EOEC) conditions. We found that the permutation test with Threshold‐Free Cluster Enhancement (TFCE), a strict multiple comparison correction strategy, reached the best balance between the family‐wise error rate (under 5%) and test–retest reliability/replicability (e.g., for between‐subject sex differences in the amplitude of low‐frequency fluctuations (ALFF), 0.68 for test–retest reliability and 0.25 for replicability; 0.49 for replicability of ALFF for within‐subject EOEC differences). Although R‐fMRI indices attained moderate test–retest reliability, they replicated poorly across distinct datasets (replicability < 0.3 for between‐subject sex differences, < 0.5 for within‐subject EOEC differences). By randomly drawing different sample sizes from a single site, we found that reliability, sensitivity, and positive predictive value (PPV) rose as sample size increased. Small sample sizes (e.g., < 80 [40 per group]) not only yielded minimal statistical power (sensitivity < 2%), but also decreased the likelihood that significant results reflected "true" effects (PPV < 0.26) for sex differences. Our findings have implications for the choice of multiple comparison correction strategies and highlight the importance of sufficiently large sample sizes in R‐fMRI studies for enhancing reproducibility. Hum Brain Mapp 39:300–318, 2018. © 2017 Wiley Periodicals, Inc.
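The test–retest reliability figures quoted above (e.g., 0.68 for ALFF) are conventionally expressed as intraclass correlation coefficients (ICC). This record does not state which ICC variant the authors computed, so the sketch below implements one common choice, the two-way random-effects, absolute-agreement form ICC(2,1) of Shrout and Fleiss (1979), purely as an illustration; the function name and toy data are hypothetical.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1): two-way random effects, absolute agreement, single rater
    (Shrout & Fleiss, 1979). x has shape (n_subjects, k_sessions)."""
    n, k = x.shape
    grand = x.mean()
    ms_r = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # between subjects
    ms_c = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # between sessions
    resid = x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + grand
    ms_e = (resid ** 2).sum() / ((n - 1) * (k - 1))              # residual
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

# Toy data: 10 "subjects" measured in 2 "sessions", sharing a subject effect.
rng = np.random.default_rng(1)
subject_effect = rng.normal(size=(10, 1))
sessions = subject_effect + 0.5 * rng.normal(size=(10, 2))
print(round(icc_2_1(sessions), 2))  # moderate-to-good reliability expected
```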
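The correction strategy the summary singles out, a permutation test with TFCE, controls the family-wise error rate by comparing each enhanced statistic against the permutation distribution of the image-wise maximum. The published analyses operate on 3‑D brain volumes with standard neuroimaging software; the following is a deliberately simplified 1‑D sketch, assuming a two-sample design, the usual TFCE parameters E = 0.5 and H = 2, and contiguity along a single axis. All names are invented for illustration.

```python
import numpy as np
from scipy import ndimage, stats

rng = np.random.default_rng(0)

def tfce_1d(t, dh=0.1, E=0.5, H=2.0):
    """Simplified 1-D TFCE (Smith & Nichols, 2009): for each threshold h,
    credit every supra-threshold point with extent**E * h**H * dh."""
    t = np.clip(t, 0, None)                 # enhance the positive tail only
    out = np.zeros_like(t)
    for h in np.arange(dh, t.max() + dh, dh):
        labels, n_clusters = ndimage.label(t >= h)
        for c in range(1, n_clusters + 1):
            mask = labels == c
            out[mask] += mask.sum() ** E * h ** H * dh
    return out

def max_tfce_perm_test(a, b, n_perm=1000):
    """FWE-corrected two-sample test: permute group labels and use the
    maximum TFCE score per permutation as the null statistic."""
    data = np.vstack([a, b])
    n_a = len(a)
    obs = tfce_1d(stats.ttest_ind(a, b, axis=0).statistic)
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(len(data))
        pa, pb = data[perm[:n_a]], data[perm[n_a:]]
        null_max[i] = tfce_1d(stats.ttest_ind(pa, pb, axis=0).statistic).max()
    p_fwe = (1 + (null_max[None, :] >= obs[:, None]).sum(axis=1)) / (n_perm + 1)
    return obs, p_fwe

# Hypothetical demo: 20 vs. 20 "subjects", 200 voxels, signal in voxels 80-100.
a = rng.normal(size=(20, 200)); a[:, 80:100] += 0.8
b = rng.normal(size=(20, 200))
obs, p = max_tfce_perm_test(a, b, n_perm=500)
print((p < 0.05).nonzero()[0])  # voxels surviving FWE correction
```

Taking the maximum enhanced statistic over the whole image in each permutation is what yields family-wise, rather than per-voxel, error control.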
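The positive predictive value figures (PPV < 0.26 at small sample sizes) follow from the standard relation among power, the false-positive rate α, and the prior probability that a tested effect is real: PPV = power·prior / (power·prior + α·(1 − prior)), as popularized for neuroscience by Button et al. (2013). The record does not give the prior the authors assumed, so the numbers below are illustrative only.

```python
def ppv(power: float, alpha: float, prior: float) -> float:
    """P(effect is real | result is significant), by Bayes' rule."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Illustrative: with the ~2% sensitivity reported for n < 80 at a 5%
# family-wise error rate, even an optimistic 50% prior keeps PPV low.
print(round(ppv(power=0.02, alpha=0.05, prior=0.5), 2))  # 0.29
```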
ISSN: 1065-9471
EISSN: 1097-0193
DOI: 10.1002/hbm.23843