Reference-point centering and range-adaptation enhance human reinforcement learning at the cost of irrational preferences

Bibliographic Details
Published in: Nature Communications, Vol. 9, No. 1, Article 4503, pp. 1-12
Main Authors: Bavard, Sophie; Lebreton, Maël; Khamassi, Mehdi; Coricelli, Giorgio; Palminteri, Stefano
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 29 October 2018

Summary: In economics and perceptual decision-making, contextual effects are well documented: decision weights are adjusted as a function of the distribution of stimuli. Yet, in the reinforcement learning literature, whether and how contextual information pertaining to decision states is integrated into learning algorithms has received comparatively little attention. Here, we investigate reinforcement learning behavior and its computational substrates in a task where we orthogonally manipulate outcome valence and magnitude, resulting in systematic variations in state values. Model comparison indicates that subjects' behavior is best accounted for by an algorithm that includes both reference-point dependence and range adaptation, two crucial features of state-dependent valuation. In addition, we find that state-dependent outcome valuation progressively emerges, is favored by increasing outcome information, and is correlated with explicit understanding of the task structure. Finally, our data clearly show that, while locally adaptive (for instance in negative-valence and small-magnitude contexts), state-dependent valuation comes at the cost of seemingly irrational choices when options are extrapolated out of their original contexts. Humans often make sub-optimal decisions, choosing options that are less advantageous than available alternatives. Using computational modeling of behavior, the authors demonstrate that such irrational choices can arise from context dependence in reinforcement learning.
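As a rough illustration of the two mechanisms named in the summary, the sketch below shows a Q-learning agent that centers each outcome on a learned per-context reference point and rescales it by a learned per-context outcome range. This is a minimal sketch under stated assumptions: the toy contexts, the specific update rules, and the parameter names (alpha_q, alpha_ctx, beta) are illustrative choices, not the authors' exact model or task.

```python
import numpy as np

# Minimal sketch of context-dependent Q-learning with
# reference-point centering and range adaptation.
# Update rules, parameters, and the toy task are illustrative
# assumptions, not the published model.

rng = np.random.default_rng(0)

n_trials = 200
alpha_q = 0.3      # learning rate for option values
alpha_ctx = 0.3    # learning rate for context-level statistics
beta = 5.0         # softmax inverse temperature

# Two contexts with different valence and magnitude:
# context 0: small gains (0.1 vs 0.0); context 1: losses (-0.1 vs -1.0)
contexts = {0: (0.1, 0.0), 1: (-0.1, -1.0)}

Q = np.zeros((2, 2))   # option values per (context, option)
V = np.zeros(2)        # reference point per context (running mean outcome)
R = np.ones(2)         # outcome range per context (init 1 avoids div by 0)

for t in range(n_trials):
    c = rng.integers(2)
    # softmax choice between the two options of the current context
    p = np.exp(beta * Q[c]) / np.exp(beta * Q[c]).sum()
    a = rng.choice(2, p=p)
    r = contexts[c][a]

    # reference-point centering: encode the outcome relative to the
    # context's expected value; range adaptation: rescale by the
    # context's learned outcome range
    r_rel = (r - V[c]) / R[c]

    # update context statistics, then the chosen option's value
    V[c] += alpha_ctx * (r - V[c])
    R[c] += alpha_ctx * (abs(r - V[c]) - R[c])
    Q[c, a] += alpha_q * (r_rel - Q[c, a])

print("Q:", Q.round(2))  # option values end up on a similar relative scale
```

Because option values are learned on this context-relative scale, the better option of a loss context can end up with a stored value comparable to, or higher than, a mediocre option of a gain context; this is one way such a locally adaptive scheme can yield seemingly irrational preferences when options are compared outside their original contexts, as the summary describes.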
ISSN: 2041-1723
DOI: 10.1038/s41467-018-06781-2