Joint modeling of reaction times and choice improves parameter identifiability in reinforcement learning models

Bibliographic Details
Published in	Journal of neuroscience methods Vol. 317; pp. 37-44
Main Authors	Ballard, Ian C., McClure, Samuel M.
Format	Journal Article
Language	English
Published	Netherlands, Elsevier B.V., 01.04.2019
Subjects	Animals; Bayes Theorem; Choice Behavior; Delay discounting; Humans; Intertemporal choice; Models, Neurological; Models, Psychological; Parameter estimation; Power; Q-learning; Reaction Time; Reinforcement, Psychology; Reproducibility; Striatum
Summary:	• Parameters of reinforcement learning models are particularly difficult to estimate.
• Incorporating reaction times into model fitting improves parameter identifiability.
• Bayesian weighting of choice and reaction times improves the power of analyses assessing learning rate.

Reinforcement learning models provide excellent descriptions of learning in multiple species across a variety of tasks. Many researchers are interested in relating parameters of reinforcement learning models to neural measures, psychological variables or experimental manipulations. We demonstrate that parameter identification is difficult because a range of parameter values fit the data approximately equally well. This identification problem has a large impact on power: we show that a researcher who wants to detect a medium-sized correlation (r = .3) between a variable and learning rate with 80% power must collect 60% more subjects than specified by a typical power analysis in order to account for the noise introduced by model fitting. We derive a Bayesian optimal model fitting technique that takes advantage of information contained in choices and reaction times to constrain parameter estimates. We show, using simulated and empirical data, that this method substantially improves the ability to recover learning rates. We compare this method against the use of Bayesian priors. We show in simulations that the combined use of Bayesian priors and reaction times confers the highest parameter identifiability. However, in real data, where the priors may have been misspecified, the use of Bayesian priors interferes with the ability of reaction time data to improve parameter identifiability. We present a simple technique that takes advantage of readily available data to substantially improve the quality of inferences that can be drawn from parameters of reinforcement learning models.
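
The power figure in the summary can be made concrete with a short worked calculation. The sketch below is not the authors' analysis: it assumes a two-sided test at alpha = .05 (the summary does not state the test) and uses the standard Fisher z approximation for the sample size needed to detect a correlation, then applies the 60% increase quoted in the summary. The function names `n_for_correlation` and `inflate` are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect a correlation r.

    Uses the Fisher z approximation with a two-sided test.
    alpha = 0.05 is an assumption; the summary does not specify it.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


def inflate(n, extra_percent):
    """Scale a sample size up by a whole-number percentage, rounding up."""
    return math.ceil(n * (100 + extra_percent) / 100)


n_nominal = n_for_correlation(0.3)   # "typical power analysis" under the assumptions above
n_adjusted = inflate(n_nominal, 60)  # 60% increase, as quoted in the summary
print(n_nominal, n_adjusted)
```

Under these assumptions, the nominal requirement is about 85 subjects, and the 60% increase described in the summary brings it to about 136. The exact numbers depend on the assumed alpha and on the approximation used.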
Bibliography:	Author contributions: IB conceptualized the study, conducted the analysis, and wrote the manuscript. SM provided supervision and critical revisions.
ISSN:	0165-0270, 1872-678X
DOI:	10.1016/j.jneumeth.2019.01.006