Assumption-checking rather than (just) testing: The importance of visualization and effect size in statistical diagnostics

Bibliographic Details
Published in	Behavior research methods Vol. 56; no. 2; pp. 826–845
Main Author	Shatz, Itamar
Format	Journal Article
Language	English
Published	New York: Springer US, 01.02.2024
Subjects	Behavioral Science and Psychology; Cognitive Psychology; Data Visualization; Linear Models; Psychology; Visualization; Statistical diagnostics; Assumption checks; Statistical assumptions; Null hypothesis significance testing; Graphical methods

Summary:	Statistical methods generally have assumptions (e.g., normality in linear regression models). Violations of these assumptions can cause various issues, like statistical errors and biased estimates, whose impact can range from inconsequential to critical. Accordingly, it is important to check these assumptions, but this is often done in a flawed way. Here, I first present a prevalent but problematic approach to diagnostics: testing assumptions using null hypothesis significance tests (e.g., the Shapiro–Wilk test of normality). Then, I consolidate and illustrate the issues with this approach, primarily using simulations. These issues include statistical errors (i.e., false positives, especially with large samples, and false negatives, especially with small samples), false binarity, limited descriptiveness, misinterpretation (e.g., of the p-value as an effect size), and potential testing failure due to unmet test assumptions. Finally, I synthesize the implications of these issues for statistical diagnostics, and provide practical recommendations for improving such diagnostics. Key recommendations include maintaining awareness of the issues with assumption tests (while recognizing they can be useful), using appropriate combinations of diagnostic methods (including visualization and effect sizes) while recognizing their limitations, and distinguishing between testing and checking assumptions. Additional recommendations include judging assumption violations as a complex spectrum (rather than a simplistic binary), using programmatic tools that increase replicability and decrease researcher degrees of freedom, and sharing the material and rationale involved in the diagnostics.
ISSN:	1554-3528, 1554-351X
DOI:	10.3758/s13428-023-02072-x
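
Illustration (not part of the catalog record or of the article): the Summary notes that assumption tests tend to flag trivial violations in large samples and to miss substantial ones in small samples. The Python sketch below is one rough way to see that pattern by simulation. It is an assumption-laden stand-in, not the author's method: it swaps the Shapiro–Wilk test for the Jarque-Bera test, chosen only because its asymptotic p-value has a closed form that needs no third-party library, and the two samplers, the sample sizes, and the function names `jarque_bera` and `rejection_rate` are all hypothetical choices made for this example.

```python
import math
import random


def jarque_bera(xs):
    """Return (skewness, excess kurtosis, asymptotic p-value) for a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Chi-square survival function with 2 degrees of freedom: exp(-x / 2).
    return skew, excess_kurt, math.exp(-jb / 2.0)


def rejection_rate(sampler, n, reps, rng, alpha=0.05):
    """Share of simulated samples of size n in which normality is rejected."""
    rejections = 0
    for _ in range(reps):
        xs = [sampler(rng) for _ in range(n)]
        if jarque_bera(xs)[2] < alpha:
            rejections += 1
    return rejections / reps


def mildly_skewed(rng):
    # Assumed example of a small departure from normality (skewness about 0.18).
    return rng.gauss(0.0, 1.0) + 0.5 * rng.expovariate(1.0)


def strongly_skewed(rng):
    # Assumed example of a large departure from normality (skewness of 2).
    return rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
large_rate = rejection_rate(mildly_skewed, n=5000, reps=40, rng=rng)
small_rate = rejection_rate(strongly_skewed, n=10, reps=2000, rng=rng)

print(f"mild skew,   n=5000: rejected in {large_rate:.0%} of samples")
print(f"strong skew, n=10:   rejected in {small_rate:.0%} of samples")
```

Under these assumptions the mild violation is rejected far more often than the severe one, which is why the p-value alone says little about how large a violation is. Reporting the skewness and excess kurtosis that `jarque_bera` also returns, alongside a plot, is closer in spirit to the effect-size and visualization approach the Summary recommends. Note that the asymptotic p-value is itself unreliable at n=10, a small instance of the "unmet test assumptions" issue the Summary lists.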