Inverse Bayesian Optimization: Learning Human Acquisition Functions in an Exploration vs Exploitation Search Task

Bibliographic Details
Published in: arXiv.org
Main Authors: Sandholtz, Nathan; Miyamoto, Yohsuke; Bornn, Luke; Smith, Maurice
Format: Paper; Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 02.02.2022
ISSN: 2331-8422
DOI: 10.48550/arxiv.2104.09237

Summary: This paper introduces a probabilistic framework for estimating the parameters of an acquisition function from observed human behavior that can be modeled as a collection of sample paths from a Bayesian optimization procedure. The methodology defines a likelihood on observed human behavior in an optimization task, where the likelihood is parameterized by a Bayesian optimization subroutine governed by an unknown acquisition function. This structure enables us to make inferences about a subject's acquisition function while allowing their behavior to deviate around the solution to the Bayesian optimization subroutine. To test our methods, we designed a sequential optimization task that forced subjects to balance exploration and exploitation in search of an invisible target location. Applying our proposed methods to the resulting data, we find that many subjects tend to exhibit exploration preferences beyond what standard acquisition functions can capture. Guided by the model discrepancies, we augment the candidate acquisition functions to yield a superior fit to the human behavior in this task.
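
The idea in the summary, a likelihood on observed choices that is parameterized by a Bayesian optimization subroutine, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-dimensional search space, a Gaussian process surrogate with a squared-exponential kernel, an upper-confidence-bound acquisition function with a single parameter `kappa`, Gaussian deviations of each choice around the acquisition maximizer, and a grid search over `kappa`. All function names are hypothetical.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, grid, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(grid, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bo_proposal(x_obs, y_obs, grid, kappa):
    """Grid point that maximizes the assumed UCB acquisition, mean + kappa * sd."""
    mean, sd = gp_posterior(x_obs, y_obs, grid)
    return grid[np.argmax(mean + kappa * sd)]

def log_likelihood(kappa, xs, ys, grid, sigma=0.05, n_init=2):
    """Gaussian log-likelihood of each observed choice around the BO proposal."""
    ll = 0.0
    for t in range(n_init, len(xs)):
        prop = bo_proposal(xs[:t], ys[:t], grid, kappa)
        resid = (xs[t] - prop) / sigma
        ll += -0.5 * resid ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    return ll

def simulate(kappa, f, grid, n_steps, rng, sigma=0.05, n_init=2):
    """Synthetic 'subject': BO proposals perturbed by Gaussian noise."""
    xs = list(rng.uniform(0.0, 1.0, size=n_init))
    ys = [f(x) for x in xs]
    for _ in range(n_steps):
        prop = bo_proposal(np.array(xs), np.array(ys), grid, kappa)
        x = float(prop + sigma * rng.normal())
        xs.append(x)
        ys.append(f(x))
    return np.array(xs), np.array(ys)

def fit_kappa(xs, ys, grid, candidates):
    """Grid-search estimate of kappa by maximizing the log-likelihood."""
    lls = np.array([log_likelihood(k, xs, ys, grid) for k in candidates])
    return candidates[int(np.argmax(lls))], lls

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.linspace(0.0, 1.0, 101)
    reward = lambda x: -(x - 0.6) ** 2  # hypothetical hidden reward surface
    xs, ys = simulate(2.0, reward, grid, 15, rng)
    k_hat, lls = fit_kappa(xs, ys, grid, np.array([0.0, 0.5, 1.0, 2.0, 4.0]))
    print("estimated kappa:", k_hat)
```

With a single short path the estimate of `kappa` can be noisy; pooling the log-likelihood across several sample paths from the same subject would be the natural extension under these assumptions.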