The complexity of measuring reliability in learning tasks: An illustration using the Alternating Serial Reaction Time Task

Bibliographic Details
Published in Behavior Research Methods Vol. 56; no. 1; pp. 301-317
Main Authors Farkas, Bence C., Krajcsi, Attila, Janacsek, Karolina, Nemeth, Dezso
Format Journal Article
Language English
Published New York Springer US 01.01.2024
Springer Nature B.V
Psychonomic Society, Inc
Summary: Despite the fact that reliability estimation is crucial for robust inference, it is underutilized in neuroscience and cognitive psychology. Appreciating reliability can help researchers increase statistical power, effect sizes, and reproducibility, decrease the impact of measurement error, and inform methodological choices. However, accurately calculating reliability for many experimental learning tasks is challenging. In this study, we highlight a number of these issues, and estimate multiple metrics of internal consistency and split-half reliability of a widely used learning task on a large sample of 180 subjects. We show how pre-processing choices, task length, and sample size can affect reliability and its estimation. Our results show that the Alternating Serial Reaction Time Task has respectable reliability, especially when learning scores are calculated based on reaction times and two-stage averaging. We also show that a task length of 25 blocks can be sufficient to meet the usual thresholds for minimally acceptable reliability. We further illustrate how relying on a single point estimate of reliability can be misleading, and the calculation of multiple metrics, along with their uncertainties, can lead to a more complete characterization of the psychometric properties of tasks.
PMCID: PMC10794483
ISSN: 1554-3528
1554-351X
DOI: 10.3758/s13428-022-02038-5