Policy Adjustment in a Dynamic Economic Game

Bibliographic Details
Published in	PLoS ONE Vol. 1; no. 1; p. e103
Main Authors	Li, Jian; McClure, Samuel M.; King-Casas, Brooks; Montague, P. Read
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 20 December 2006
Subjects	Adult; Behavior; Brain; Brain - physiology; Choice learning; Circuits; Computational Biology/Computational Neuroscience; Contingency; Corpus Striatum - physiology; Cortex (frontal); Decision making; Decision Making - physiology; Dopamine; Emotions; Female; Frontal gyrus; Functional magnetic resonance imaging; Game theory; Harvesting; Humans; Laboratories; Learning; Learning - physiology; Magnetic Resonance Imaging; Male; Medical imaging; Medicine; Mental task performance; Models, Economic; Models, Neurological; Models, Psychological; Neostriatum; Neuroimaging; Neurology; Neuroscience/Cognitive Neuroscience; Neurosciences; Operant conditioning; Paradigms; Prefrontal cortex; Prefrontal Cortex - physiology; Reinforcement; Reinforcement (Psychology); Reward; Task complexity; Young Adult; United States > US Texas
Summary:	Making sequential decisions to harvest rewards is a notoriously difficult problem. One difficulty is that the real world is not stationary and the reward expected from a contemplated action may depend in complex ways on the history of an animal's choices. Previous functional neuroimaging work combined with principled models has detected brain responses that correlate with computations thought to guide simple learning and action choice. Those works generally employed instrumental conditioning tasks with fixed action-reward contingencies. For real-world learning problems, the history of reward-harvesting choices can change the likelihood of rewards collected by the same choices in the near-term future. We used functional MRI to probe brain and behavioral responses in a continuous decision-making task where reward contingency is a function of both a subject's immediate choice and his choice history. In these more complex tasks, we demonstrated that a simple actor-critic model can account for both the subjects' behavioral and brain responses, and identified a reward prediction error signal in ventral striatal structures active during these non-stationary decision tasks. However, a sudden introduction of new reward structures engages more complex control circuitry in the prefrontal cortex (inferior frontal gyrus and anterior insula) and is not captured by a simple actor-critic model. Taken together, these results extend our knowledge of reward-learning signals into more complex, history-dependent choice tasks. They also highlight the important interplay between striatum and prefrontal cortex as decision-makers respond to the strategic demands imposed by non-stationary reward environments more reminiscent of real-world tasks.
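The summary mentions a simple actor-critic model and a reward prediction error in a task where payoffs depend on choice history. The following is a minimal illustrative sketch of that general idea, not the authors' model or task: the payoff function `reward`, the two options "A" and "B", the history window, and all parameter values (`alpha`, `beta`, `temperature`) are assumptions made up for this example.

```python
import math
import random
from collections import deque


def reward(choice, frac_a):
    """Hypothetical history-dependent payoff (not taken from the paper).

    frac_a is the fraction of recent trials on which A was chosen.
    A pays less the more it has been chosen lately; B pays more.
    """
    if choice == "A":
        return 1.0 - frac_a
    return 0.2 + 0.6 * frac_a


def prob_choose_a(pref_a, pref_b, temperature=0.2):
    """Two-option softmax over the actor's preferences."""
    z = (pref_a - pref_b) / temperature
    z = max(-50.0, min(50.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n_trials=500, window=20, alpha=0.1, beta=0.1, seed=0):
    """Run a single-state actor-critic learner on the toy task."""
    rng = random.Random(seed)
    # Start from a balanced history; 1 means "chose A", 0 means "chose B".
    history = deque([0, 1] * (window // 2), maxlen=window)
    value = 0.0                     # critic: expected reward
    prefs = {"A": 0.0, "B": 0.0}    # actor: action preferences
    choices, deltas = [], []
    for _ in range(n_trials):
        frac_a = sum(history) / len(history)
        p_a = prob_choose_a(prefs["A"], prefs["B"])
        choice = "A" if rng.random() < p_a else "B"
        r = reward(choice, frac_a)
        delta = r - value               # reward prediction error
        value += alpha * delta          # critic update
        prefs[choice] += beta * delta   # actor update for the chosen action
        history.append(1 if choice == "A" else 0)
        choices.append(choice)
        deltas.append(delta)
    return choices, deltas, value


if __name__ == "__main__":
    choices, deltas, value = simulate()
    print("fraction of A choices:", choices.count("A") / len(choices))
    print("final critic value:", round(value, 3))
```

In this sketch the same prediction error `delta` drives both the critic and the actor, and because each payoff depends on the recent choice history, the reward attached to a given action keeps shifting as the learner's own policy changes.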
Bibliography:	Conceived and designed the experiments: SM PM. Performed the experiments: PM JL. Analyzed the data: SM PM JL BK. Contributed reagents/materials/analysis tools: SM PM JL BK. Wrote the paper: SM PM JL BK. Current address: Department of Psychology, Princeton University, Princeton, New Jersey, United States of America
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0000103