Predicting Goal-Directed Human Attention Using Inverse Reinforcement Learning
Published in | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) Vol. 2020; pp. 190 - 199 |
---|---|
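Main Authors | Yang, Zhibo; Huang, Lihan; Chen, Yupei; Wei, Zijun; Ahn, Seoyoung; Zelinsky, Gregory; Samaras, Dimitris; Hoai, Minh |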
Format | Conference Proceeding Journal Article |
Language | English |
Published | United States, IEEE, 01.06.2020 |
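Subjects | Computational modeling; Context modeling; Learning (artificial intelligence); Predictive models; Search problems; Task analysis; Visualization |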
Abstract | Human gaze behavior prediction is important for behavioral vision and for computer vision applications. Most models mainly focus on predicting free-viewing behavior using saliency maps, but do not generalize to goal-directed behavior, such as when a person searches for a visual target object. We propose the first inverse reinforcement learning (IRL) model to learn the internal reward function and policy used by humans during visual search. We modeled the viewer's internal belief states as dynamic contextual belief maps of object locations. These maps were learned and then used to predict behavioral scanpaths for multiple target categories. To train and evaluate our IRL model we created COCO-Search18, which is now the largest dataset of high-quality search fixations in existence. COCO-Search18 has 10 participants searching for each of 18 target-object categories in 6202 images, making about 300,000 goal-directed fixations. When trained and evaluated on COCO-Search18, the IRL model outperformed baseline models in predicting search fixation scanpaths, both in terms of similarity to human search behavior and search efficiency. Finally, reward maps recovered by the IRL model reveal distinctive target-dependent patterns of object prioritization, which we interpret as a learned object context. |
---|---|
Author | Samaras, Dimitris; Zelinsky, Gregory; Wei, Zijun; Ahn, Seoyoung; Hoai, Minh; Huang, Lihan; Yang, Zhibo; Chen, Yupei |
AuthorAffiliation | 1 Stony Brook University; 2 Adobe Inc |
Author_xml | 1. Yang, Zhibo (Stony Brook University); 2. Huang, Lihan (Stony Brook University); 3. Chen, Yupei (Stony Brook University); 4. Wei, Zijun (Adobe Inc); 5. Ahn, Seoyoung (Stony Brook University); 6. Zelinsky, Gregory (Stony Brook University); 7. Samaras, Dimitris (Stony Brook University); 8. Hoai, Minh (Stony Brook University) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/34163124 (record in MEDLINE/PubMed) |
CODEN | IEEPAD |
ContentType | Conference Proceeding Journal Article |
DOI | 10.1109/CVPR42600.2020.00027 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present; PubMed; MEDLINE - Academic; PubMed Central (Full Participant titles) |
Discipline | Applied Sciences Computer Science |
EISBN | 9781728171685 1728171687 |
EISSN | 1063-6919 |
EndPage | 199 |
ExternalDocumentID | PMC8218821 34163124 9157666 |
Genre | orig-research Journal Article |
GrantInformation_xml | Funder: NEI NIH HHS; grant ID: R01 EY030669 |
ISSN | 1063-6919 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | true |
OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/8218821 |
PMID | 34163124 |
PageCount | 10 |
PublicationDateYYYYMMDD | 2020-06-01 |
PublicationPlace | United States |
PublicationTitle | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) |
PublicationTitleAbbrev | CVPR |
PublicationTitleAlternate | Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit |
PublicationYear | 2020 |
Publisher | IEEE |
StartPage | 190 |
SubjectTerms | Computational modeling; Context modeling; Learning (artificial intelligence); Predictive models; Search problems; Task analysis; Visualization |
URI | https://ieeexplore.ieee.org/document/9157666 https://www.ncbi.nlm.nih.gov/pubmed/34163124 https://www.proquest.com/docview/2544879049 https://pubmed.ncbi.nlm.nih.gov/PMC8218821 |
Volume | 2020 |