Reversal Learning in Humans and Gerbils: Dynamic Control Network Facilitates Learning
Published in: Frontiers in Neuroscience, Vol. 10, p. 535
Main Authors:
Format: Journal Article
Language: English
Published: Frontiers Research Foundation / Frontiers Media S.A., Switzerland, 17.11.2016
Subjects:
Online Access: Get full text
Summary: Biologically plausible modeling of behavioral reinforcement learning tasks has seen great improvements over the past decades. Less work has been dedicated to tasks involving contingency reversals, i.e., tasks in which the original behavioral goal is reversed one or multiple times. The ability to adjust to such reversals is a key element of behavioral flexibility. Here, we investigate the neural mechanisms underlying contingency-reversal tasks. We first conduct experiments with humans and gerbils to demonstrate memory effects, including multiple reversals in which subjects (humans and animals) show a faster learning rate when a previously learned contingency reappears. Motivated by recurrent mechanisms of learning and memory for object categories, we propose a network architecture in which reinforcement learning steers an orienting system that monitors success in reward acquisition. We suggest that a model sensory system provides feature representations which are further processed by category-related subnetworks that constitute a neural analog of expert networks. Categories are selected dynamically in a competitive field and predict the expected reward. Learning occurs in sequentialized phases that selectively focus weight adaptation on synapses in the hierarchical network and modulate their weight changes by a global modulator signal. The orienting subsystem itself learns to bias the competition in the presence of continuous monotonic reward accumulation. In case of sudden changes in the discrepancy between predicted and acquired reward, the activated motor category can be switched. We suggest that this subsystem is composed of a hierarchically organized network of dis-inhibitory mechanisms, dubbed a dynamic control network (DCN), which resembles components of the basal ganglia. The DCN selectively activates an expert network corresponding to the current behavioral strategy. The trace of the accumulated reward is monitored such that large sudden deviations from the monotonicity of its evolution trigger a reset, after which another expert subnetwork can be activated (if it has already been established before) or new categories can be recruited and associated with novel behavioral patterns.
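As a rough illustration of the mechanism sketched in the summary, the following minimal Python example implements a reversal-learning agent that keeps a pool of "expert" action-value tables, monitors a slowly integrated reward trace, and switches to (or recruits) another expert when the trace breaks down after acquisition. This is not the authors' published model; all class names, parameters, and thresholds are assumptions chosen for demonstration.

```python
# Illustrative sketch only: an expert-pool agent with a reward-trace monitor,
# loosely mirroring the dynamic-control-network idea summarized above.
import random


class ExpertPoolAgent:
    def __init__(self, n_actions=2, alpha=0.2, eps=0.05):
        self.n_actions = n_actions
        self.alpha = alpha                      # value-learning rate
        self.eps = eps                          # exploration probability
        self.experts = [[0.0] * n_actions]      # each expert: one action-value table
        self.active = 0                         # index of the currently selected expert
        self.reward_trace = 0.5                 # slowly integrated reward estimate
        self.established = False                # True once the current contingency is acquired

    def act(self):
        q = self.experts[self.active]
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: q[a])

    def update(self, action, reward):
        q = self.experts[self.active]
        q[action] += self.alpha * (reward - q[action])
        self.reward_trace += 0.1 * (reward - self.reward_trace)
        if self.reward_trace > 0.75:
            self.established = True
        # A sharp break in the accumulated-reward trace after acquisition
        # acts as the "reset": switch to another expert or recruit a new one.
        if self.established and self.reward_trace < 0.45:
            self._switch_expert()

    def _switch_expert(self):
        others = [i for i in range(len(self.experts)) if i != self.active]
        if others:
            # Reactivate a previously established expert; its retained values
            # yield the faster relearning seen on repeated reversals.
            self.active = max(others, key=lambda i: max(self.experts[i]))
        else:
            self.experts.append([0.0] * self.n_actions)
            self.active = len(self.experts) - 1
        self.reward_trace = 0.5
        self.established = False


# Toy usage: a two-choice task whose rewarded action is reversed twice.
if __name__ == "__main__":
    agent, correct = ExpertPoolAgent(), 0
    for trial in range(600):
        if trial in (200, 400):                 # contingency reversals
            correct = 1 - correct
        a = agent.act()
        agent.update(a, 1.0 if a == correct else 0.0)
```

In this toy setup, the retained values of a previously used expert account for the faster relearning when an earlier contingency reappears, playing the role that reactivation of an established expert network plays in the model described above.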
Bibliography: Edited by: Mark Walton, University of Oxford, UK. Reviewed by: Giovanni Pezzulo, National Research Council, Italy; Nicole Kristen Horst, University of Cambridge, UK. This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Neuroscience.
ISSN: 1662-4548, 1662-453X
DOI: 10.3389/fnins.2016.00535