Predicting Goal-Directed Human Attention Using Inverse Reinforcement Learning

Bibliographic Details
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) Vol. 2020; pp. 190-199
Main Authors Yang, Zhibo; Huang, Lihan; Chen, Yupei; Wei, Zijun; Ahn, Seoyoung; Zelinsky, Gregory; Samaras, Dimitris; Hoai, Minh
Format Conference Proceeding; Journal Article
Language English
Published United States IEEE 01.06.2020
Abstract Human gaze behavior prediction is important for behavioral vision and for computer vision applications. Most models mainly focus on predicting free-viewing behavior using saliency maps, but do not generalize to goal-directed behavior, such as when a person searches for a visual target object. We propose the first inverse reinforcement learning (IRL) model to learn the internal reward function and policy used by humans during visual search. We modeled the viewer's internal belief states as dynamic contextual belief maps of object locations. These maps were learned and then used to predict behavioral scanpaths for multiple target categories. To train and evaluate our IRL model we created COCO-Search18, which is now the largest dataset of high-quality search fixations in existence. COCO-Search18 has 10 participants searching for each of 18 target-object categories in 6202 images, making about 300,000 goal-directed fixations. When trained and evaluated on COCO-Search18, the IRL model outperformed baseline models in predicting search fixation scanpaths, both in terms of similarity to human search behavior and search efficiency. Finally, reward maps recovered by the IRL model reveal distinctive target-dependent patterns of object prioritization, which we interpret as a learned object context.
AuthorAffiliation 1 Stony Brook University: Yang, Zhibo; Huang, Lihan; Chen, Yupei; Ahn, Seoyoung; Zelinsky, Gregory; Samaras, Dimitris; Hoai, Minh
2 Adobe Inc: Wei, Zijun
CODEN IEEPAD
DOI 10.1109/CVPR42600.2020.00027
Discipline Applied Sciences
Computer Science
EISBN 9781728171685
1728171687
EISSN 1063-6919
EndPage 199
ExternalDocumentID PMC8218821
34163124
9157666
Genre orig-research
Journal Article
GrantInformation NEI NIH HHS, grant R01 EY030669
ISSN 1063-6919
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/8218821
PMID 34163124
PageCount 10
PublicationDate 2020-06-01
PublicationPlace United States
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationTitleAlternate Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit
Publisher IEEE
StartPage 190
SubjectTerms Computational modeling
Context modeling
Learning (artificial intelligence)
Predictive models
Search problems
Task analysis
Visualization
URI https://ieeexplore.ieee.org/document/9157666
https://www.ncbi.nlm.nih.gov/pubmed/34163124
https://www.proquest.com/docview/2544879049
https://pubmed.ncbi.nlm.nih.gov/PMC8218821
Volume 2020