A common misapplication of statistical inference: Nuisance control with null-hypothesis significance tests

Bibliographic Details
Published in	Brain and language Vol. 162; pp. 42 - 45
Main Authors	Sassenhagen, Jona; Alday, Phillip M.
Format	Journal Article
Language	English
Published	Netherlands Elsevier Inc 01.11.2016 Academic Press
Subjects	Behavioral Research - methods; Cognition; Confounding Factors (Epidemiology); Humans; Inference; Language; Miscommunication; Models, Statistical; Statistical inference; Word frequency; Word length
Summary:	• Researchers use statistical tests of stimulus or subject characteristics to “control for confounds”.
• This practice is conceptually misguided and pragmatically useless.
• We discuss the problem and alternatives.
Experimental research on behavior and cognition frequently rests on stimulus or subject selection where not all characteristics can be fully controlled, even when attempting strict matching. For example, when contrasting patients with controls, variables such as intelligence or socioeconomic status are often correlated with patient status. Similarly, when presenting word stimuli, variables such as word frequency are often correlated with the primary variables of interest. One procedure very commonly employed to control for such nuisance effects is to conduct inferential tests on confounding stimulus or subject characteristics. For example, if word length is not significantly different for two stimulus sets, the sets are considered matched for word length. Such a test has high error rates and is conceptually misguided. It reflects a common misunderstanding of statistical tests: interpreting significance as referring not to inference about a particular population parameter, but to (1) the sample in question and (2) the practical relevance of a sample difference (so that a nonsignificant test is taken as evidence for the absence of relevant differences). We show inferential testing for assessing nuisance effects to be inappropriate both pragmatically and philosophically, present a survey showing its high prevalence, and briefly discuss an alternative in the form of regression including nuisance variables.
ISSN:	0093-934X, 1090-2155
DOI:	10.1016/j.bandl.2016.08.001
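
Illustration (not from the article): the summary contrasts "matching by nonsignificant test" with regression that includes the nuisance variable. The following is a minimal sketch of that contrast using only the Python standard library. All data are made up and noise-free, the names `welch_t`, `ols`, and `rt` are hypothetical helpers, and the assumed effect sizes (30 ms for condition, 20 ms per letter of word length) are arbitrary choices for the toy example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for the difference in means (b minus a)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical word lengths for two stimulus sets (assumed data).
len_a = [4, 5, 6, 7, 8, 5, 6, 7]
len_b = [5, 6, 7, 8, 9, 5, 6, 8]


def rt(cond, length):
    """Assumed noise-free response time in ms: 30 ms condition effect, 20 ms per letter."""
    return 500 + 30 * cond + 20 * length


rt_a = [rt(0, n) for n in len_a]
rt_b = [rt(1, n) for n in len_b]

# "Matching" check criticized in the summary: test word length between the sets.
t_len = welch_t(len_a, len_b)

# Naive condition effect: difference in mean response time, ignoring word length.
naive = mean(rt_b) - mean(rt_a)

# Alternative named in the summary: regression that includes the nuisance variable.
# Design matrix columns: intercept, condition (0/1), word length.
rows = [[1, 0, n] for n in len_a] + [[1, 1, n] for n in len_b]
beta = ols(rows, rt_a + rt_b)

print("t statistic for word length:", round(t_len, 2))
print("naive condition effect (ms):", naive)
print("regression condition effect (ms):", round(beta[1], 6))
print("regression word-length slope (ms/letter):", round(beta[2], 6))
```

In this toy data the t statistic for word length is about 1.07, well below the conventional two-sided 5% critical value of roughly 2.1 for samples of this size, so the sets would be declared "matched". The naive difference in means is nevertheless 45 ms rather than the simulated 30 ms, because the 0.75-letter difference in mean length carries over into the response times. The regression with word length as a covariate recovers the simulated condition effect.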