Contextual modulation of value signals in reward and punishment learning

Bibliographic Details
Published in	Nature Communications, Vol. 6, no. 1, p. 8096
Main Authors	Palminteri, Stefano; Khamassi, Mehdi; Joffily, Mateus; Coricelli, Giorgio
Format	Journal Article
Language	English
Published	London: Nature Publishing Group UK, 25.08.2015
Subjects	631/378/1457/1369; 631/378/1457/1936; 631/378/1595; 631/378/2629/1788; Adult; Avoidance Learning - physiology; Bayes Theorem; Brain - physiology; Brain Mapping; Cerebral Cortex - physiology; Cognitive Sciences; Computer Simulation; Decision Making - physiology; Economics and Finance; Female; Functional Neuroimaging; Humanities and Social Sciences; Humans; Image Processing, Computer-Assisted; Learning - physiology; Life Sciences; Magnetic Resonance Imaging; Male; Models, Neurological; multidisciplinary; Neurons and Cognition; Prefrontal Cortex - physiology; Punishment; Reward; Science; Science (multidisciplinary); Ventral Striatum - physiology; Young Adult; reward; punishment; learning; neuroscience; biological sciences
Summary:	Compared with reward seeking, punishment avoidance learning is less clearly understood at both the computational and neurobiological levels. Here we demonstrate, using computational modelling and fMRI in humans, that learning option values on a relative, context-dependent scale offers a simple computational solution for avoidance learning. The context (or state) value sets the reference point to which an outcome is compared before the option value is updated. Consequently, in contexts with an overall negative expected value, successful punishment avoidance acquires a positive value, thus reinforcing the response. As revealed by post-learning assessment of option values, contextual influences are enhanced when subjects are informed about the result of the forgone alternative (counterfactual information). This is mirrored at the neural level by a shift in negative outcome encoding from the anterior insula to the ventral striatum, suggesting that value contextualization also limits the need to mobilize an opponent punishment learning system. In contrast to predictions from learning theory, humans learn to seek rewards and avoid punishments equally well. Here the authors offer an elegant solution to this problem by demonstrating that humans learn option values relative to a reference point subserved by a common neural substrate.
Bibliography:	PMCID: PMC4560823
ISSN:	2041-1723
DOI:	10.1038/ncomms9096
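
Illustration:	The Summary describes a context value that acts as a reference point: each outcome is compared with it before the option value is updated. The Python sketch below is one minimal way to write that idea down. It is not the authors' fitted model. The function names, the learning rates (alpha_q, alpha_v), the order of the two updates, the deterministic outcomes, and the alternating choice schedule are all assumptions made for illustration. It contrasts an absolute learner with a relative one in a punishment context where one option avoids a loss (outcome 0) and the other incurs it (outcome -1).

```python
def absolute_update(q, outcome, alpha_q=0.3):
    """Standard delta rule: the option value tracks the raw outcome."""
    return q + alpha_q * (outcome - q)


def relative_update(q, v, outcome, alpha_q=0.3, alpha_v=0.3):
    """Context-dependent delta rule (illustrative assumption, not the paper's code).

    v is the context (state) value, used as the reference point.
    q is the value of the chosen option, learned relative to v.
    """
    v_new = v + alpha_v * (outcome - v)            # context value tracks the average outcome
    q_new = q + alpha_q * ((outcome - v_new) - q)  # outcome is centred on the context value
    return q_new, v_new


def simulate(n_trials=200):
    """Punishment context with hypothetical deterministic outcomes.

    'avoid' always yields 0 (loss avoided); 'punished' always yields -1.
    Choices simply alternate so that both options are sampled equally.
    """
    outcomes = {"avoid": 0.0, "punished": -1.0}
    q_abs = {"avoid": 0.0, "punished": 0.0}
    q_rel = {"avoid": 0.0, "punished": 0.0}
    v = 0.0
    for t in range(n_trials):
        choice = "avoid" if t % 2 == 0 else "punished"
        r = outcomes[choice]
        q_abs[choice] = absolute_update(q_abs[choice], r)
        q_rel[choice], v = relative_update(q_rel[choice], v, r)
    return {"q_abs": q_abs, "q_rel": q_rel, "context_value": v}


if __name__ == "__main__":
    result = simulate()
    print("absolute values:", result["q_abs"])
    print("relative values:", result["q_rel"])
    print("context value:  ", result["context_value"])
```

Under these assumptions the absolute learner never assigns the avoiding option a value above zero, so avoidance is not directly reinforced. The relative learner ends with a negative context value and a positive value for the avoiding option, which matches the qualitative claim in the Summary.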