Implications of Human Irrationality for Reinforcement Learning

Bibliographic Details
Main Authors: Chen, Haiyang; Chang, Hyung Jin; Howes, Andrew
Format: Journal Article (preprint)
Language: English
Published: 2020-06-07
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning; Computer Science - Robotics; Statistics - Machine Learning
Online Access: https://arxiv.org/abs/2006.04072
DOI: 10.48550/arxiv.2006.04072
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0

Abstract
Recent work in the behavioural sciences has begun to overturn the long-held belief that human decision making is irrational, suboptimal and subject to biases. This turn to the rational suggests that human decision making may be a better source of ideas for constraining how machine learning problems are defined than would otherwise be the case. One promising idea concerns human decision making that is dependent on apparently irrelevant aspects of the choice context. Previous work has shown that by taking into account choice context and making relational observations, people can maximize expected value. Other work has shown that partially observable Markov decision processes (POMDPs) are a useful way to formulate human-like decision problems. Here, we propose a novel POMDP model for contextual choice tasks and show that, despite the apparent irrationalities, a reinforcement learner can take advantage of the way that humans make decisions. We suggest that human irrationalities may offer a productive source of inspiration for improving the design of AI architectures and machine learning methods.
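
The abstract describes the contribution only at a high level. As a purely illustrative, hypothetical sketch (not taken from the paper), the Python snippet below shows one way a contextual choice task could be framed as a POMDP-style environment: the true option values are hidden state, and the agent receives only noisy relational (pairwise) observations, so a value-maximizing policy must work from the choice context rather than absolute values. The class name, state variables, noise model, and payoffs are all assumptions made for illustration.

# Hypothetical sketch (not from the paper): a contextual choice task framed as a
# POMDP-style environment. The true values of the options are hidden state; the
# agent only receives noisy relational observations (pairwise comparisons).
import random

class ContextualChoicePOMDP:
    def __init__(self, n_options=3, noise=0.1, seed=None):
        self.rng = random.Random(seed)
        self.n_options = n_options
        self.noise = noise
        self.values = None  # hidden state: true option values (unobservable)

    def reset(self):
        # Draw a new hidden state and return the first observation.
        self.values = [self.rng.random() for _ in range(self.n_options)]
        return self.observe()

    def observe(self):
        # Observation: noisy pairwise (relational) comparisons, not absolute values.
        obs = []
        for i in range(self.n_options):
            for j in range(i + 1, self.n_options):
                diff = self.values[i] - self.values[j] + self.rng.gauss(0, self.noise)
                obs.append(1 if diff > 0 else 0)  # 1 means "option i looks better than j"
        return tuple(obs)

    def step(self, action):
        # Action: index of the chosen option; reward is its true (hidden) value.
        reward = self.values[action]
        regret = max(self.values) - reward
        return reward, regret

# Example: a naive policy that picks the option winning the most noisy comparisons.
env = ContextualChoicePOMDP(seed=0)
obs = env.reset()
wins = [0] * env.n_options
k = 0
for i in range(env.n_options):
    for j in range(i + 1, env.n_options):
        if obs[k]:
            wins[i] += 1
        else:
            wins[j] += 1
        k += 1
choice = wins.index(max(wins))
reward, regret = env.step(choice)
print(f"chose option {choice}, reward={reward:.3f}, regret={regret:.3f}")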