Discounting of reward sequences: a test of competing formal models of hyperbolic discounting

Bibliographic Details
Published in	Frontiers in Psychology Vol. 5; p. 178
Main Authors	Zarr, Noah; Alexander, William H.; Brown, Joshua W.
Format	Journal Article
Language	English
Published	Switzerland: Frontiers Media S.A., 06.03.2014
Subjects	Behavioral research; discounting; exponential discounting; hyperbolic discounting; model fitting; Parallel model; Psychology; recursive model; temporal difference learning
ISSN	1664-1078
DOI	10.3389/fpsyg.2014.00178

Summary:	Humans are known to discount future rewards hyperbolically in time. Nevertheless, a formal recursive model of hyperbolic discounting remained elusive until the recent introduction of the hyperbolically discounted temporal difference (HDTD) model. Before that, models of learning (especially reinforcement learning) relied on exponential discounting, which generally provides poorer fits to behavioral data. Recently, it has been shown that hyperbolic discounting can also be approximated by a summed distribution of exponentially discounted values, instantiated in the μAgents model. The HDTD model and the μAgents model differ in one key respect, namely how they treat sequences of rewards. The μAgents model is a particular implementation of a Parallel discounting model, which values a sequence as the sum of the values of its individual rewards, whereas the HDTD model contains a non-linear interaction. To discriminate between these models, we observed how subjects discounted a sequence of three rewards and then tested how well each candidate model fit the subject data. The results show that the Parallel model generally provides a better fit to the human data.
Bibliography:	Edited by: Philip Beaman, University of Reading, UK. This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology. Reviewed by: Zheng Wang, Ohio State University, USA; Timothy Pleskac, Michigan State University, USA
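
The discounting schemes named in the summary can be illustrated with a short Python sketch. It is not taken from the article: the function names, the parameters k and gamma, and the reward values are hypothetical, and spreading the exponential discount factors evenly over (0, 1) is an assumption made here because the average of gamma ** delay over that range equals 1 / (1 + delay). The HDTD model's non-linear interaction between rewards is not reproduced, since the record does not give its equations.

```python
# Illustrative sketch only; k, gamma, and the reward values are hypothetical.

def hyperbolic(delay, k=1.0):
    """Hyperbolic discount factor: 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)


def exponential(delay, gamma=0.5):
    """Exponential discount factor: gamma ** delay."""
    return gamma ** delay


def summed_exponentials(delay, n_agents=10000):
    """Average of gamma ** delay over discount factors spread evenly on (0, 1).

    Assumption for this sketch: an even spread of gamma values. The average
    then approximates the hyperbolic factor 1 / (1 + delay), i.e. k = 1.
    """
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]
    return sum(g ** delay for g in gammas) / n_agents


def parallel_value(rewards, delays, k=1.0):
    """Parallel valuation: each reward is discounted on its own, then summed."""
    return sum(r * hyperbolic(d, k) for r, d in zip(rewards, delays))


if __name__ == "__main__":
    for delay in (0, 1, 2, 3):
        print(delay, hyperbolic(delay), summed_exponentials(delay), exponential(delay))
    # A sequence of three equal rewards at delays 1, 2, and 3.
    print(parallel_value([10.0, 10.0, 10.0], [1, 2, 3]))
```

Under the Parallel scheme the value of the three-reward sequence is just the sum of three independently discounted terms, so adding or removing one reward never changes the contribution of the others. That additivity is the property the summary contrasts with the HDTD model.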