The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations

Bibliographic Details
Published in	Sociological Methods & Research, Vol. 54, no. 3, pp. 1074-1109
Main Authors	Broska, David; Howes, Michael; van Loon, Austin
Format	Journal Article
Language	English
Published	Los Angeles, CA: SAGE Publications, Inc., 01.08.2025
Subjects	Behavior; Cost analysis; Cost Effectiveness; Human subjects; Humans; Inequality; Large language models; Power structure; Predictions; Productivity; Research subjects; prediction-powered inference (PPI); PPI correlation; mixed subjects design; effective sample size; large language models; PPI power analysis; machine learning; computational social science
Summary:	Large language models (LLMs) provide cost-effective but possibly inaccurate predictions of human behavior. Despite growing evidence that predicted and observed behavior are often not interchangeable, there is limited guidance on using LLMs to obtain valid estimates of causal effects and other parameters. We argue that LLM predictions should be treated as potentially informative observations, while human subjects serve as a gold standard in a mixed subjects design. This paradigm preserves validity and offers more precise estimates at a lower cost than experiments relying exclusively on human subjects. We demonstrate—and extend—prediction-powered inference (PPI), a method that combines predictions and observations. We define the PPI correlation as a measure of interchangeability and derive the effective sample size for PPI. We also introduce a power analysis to optimally choose between informative but costly human subjects and less informative but cheap predictions of human behavior. Mixed subjects designs could enhance scientific productivity and reduce inequality in access to costly evidence.
ISSN:	0049-1241; 1552-8294
DOI:	10.1177/00491241251326865
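
Illustration:	The summary describes prediction-powered inference (PPI) as a method that combines predictions and observations. A minimal sketch of the basic PPI estimator for a mean, as it is commonly stated in the PPI literature, may help make that idea concrete. It is not taken from the article and does not cover the extensions the article introduces; the function name `ppi_mean` and its argument names are hypothetical, chosen only for this example.

```python
from math import sqrt
from statistics import fmean, variance


def ppi_mean(y_human, pred_human, pred_unlabeled):
    """Basic prediction-powered estimate of a mean, with a standard error.

    y_human        -- outcomes observed from human subjects (gold standard)
    pred_human     -- LLM predictions for those same subjects
    pred_unlabeled -- LLM predictions for cases with no human observation
    All names are hypothetical and used only for this illustration.
    """
    n = len(y_human)
    big_n = len(pred_unlabeled)

    # Rectifier: the prediction error measured on the human-labeled cases.
    rectifier = [f - y for f, y in zip(pred_human, y_human)]

    # Mean of the cheap predictions, corrected by the average prediction error.
    estimate = fmean(pred_unlabeled) - fmean(rectifier)

    # The two samples are independent, so their sampling variances add.
    se = sqrt(variance(pred_unlabeled) / big_n + variance(rectifier) / n)
    return estimate, se


if __name__ == "__main__":
    est, se = ppi_mean(
        y_human=[1.0, 2.0, 3.0, 4.0],
        pred_human=[1.5, 2.5, 3.5, 4.5],
        pred_unlabeled=[2.0, 3.0, 4.0, 3.0],
    )
    print(est, se)
```

In this sketch the predictions overshoot every human response by the same amount, so the rectifier removes the bias exactly and contributes no variance. With noisier predictions the rectifier term grows, and the gain over a human-only sample shrinks accordingly.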