Adaptive Sampling for Discovery

Bibliographic Details
Published in	arXiv.org
Main Authors	Xu, Ziping; Shim, Eunjae; Tewari, Ambuj; Zimmerman, Paul
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 02.01.2023
Subjects	Adaptive sampling; Algorithms; Chemical reactions; Decision making; Machine learning
Summary:	In this paper, we study a sequential decision-making problem called Adaptive Sampling for Discovery (ASD). Starting with a large unlabeled dataset, algorithms for ASD adaptively label the points with the goal of maximizing the sum of responses. This problem has wide applications to real-world discovery problems, for example, drug discovery with the help of machine learning models. ASD algorithms face the well-known exploration-exploitation dilemma: the algorithm needs to choose points that yield information to improve model estimates, but it also needs to exploit the model by choosing points with high predicted responses. We rigorously formulate the problem and propose a general information-directed sampling (IDS) algorithm. We provide theoretical guarantees for the performance of IDS in linear, graph, and low-rank models. The benefits of IDS are shown in both simulation experiments and real-data experiments for discovering chemical reaction conditions.
ISSN:	2331-8422
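
Illustration (not part of the record): the ASD loop described in the summary can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a one-dimensional linear response model with Gaussian noise and a Gaussian prior, and it uses a simplified deterministic IDS-style rule that picks the unlabeled point minimizing (estimated regret)² / (information gain). The function name `run_asd` and all of its parameters are hypothetical.

```python
import math
import random


def run_asd(pool, true_theta, budget, noise_sd=0.5, prior_var=4.0,
            n_samples=200, seed=0):
    """Label `budget` points from `pool`, one at a time, without replacement.

    Assumed model (for illustration only): y = theta * x + Gaussian noise,
    with a Gaussian prior on the scalar weight theta.
    Returns (chosen indices, observed responses).
    """
    rng = random.Random(seed)
    mean, var = 0.0, prior_var            # posterior mean and variance of theta
    unlabeled = set(range(len(pool)))
    chosen, responses = [], []

    for _ in range(budget):
        cand = sorted(unlabeled)          # sorted for reproducible iteration
        # Posterior samples of theta, used to estimate expected regret.
        thetas = [rng.gauss(mean, math.sqrt(var)) for _ in range(n_samples)]
        # Best achievable response among remaining points under each sample.
        best_vals = [max(t * pool[i] for i in cand) for t in thetas]

        best_ratio, best_i = None, None
        for i in cand:
            x = pool[i]
            # Exploitation term: expected shortfall versus the best point.
            regret = sum(b - t * x for b, t in zip(best_vals, thetas)) / n_samples
            # Exploration term: mutual information between theta and a noisy
            # response at x (closed form in the Gaussian linear case).
            info = 0.5 * math.log(1.0 + x * x * var / noise_sd ** 2)
            ratio = regret ** 2 / max(info, 1e-12)
            if best_ratio is None or ratio < best_ratio:
                best_ratio, best_i = ratio, i

        # "Label" the selected point by observing a simulated response.
        x = pool[best_i]
        y = true_theta * x + rng.gauss(0.0, noise_sd)

        # Conjugate Gaussian posterior update for theta.
        precision = 1.0 / var + x * x / noise_sd ** 2
        mean = (mean / var + x * y / noise_sd ** 2) / precision
        var = 1.0 / precision

        unlabeled.remove(best_i)
        chosen.append(best_i)
        responses.append(y)

    return chosen, responses
```

For example, `run_asd([i / 10.0 for i in range(-20, 21)], true_theta=1.5, budget=8)` labels eight distinct points from a pool of 41; the sum of the returned responses is the quantity ASD aims to maximize. The ratio rule trades off the two needs named in the summary: points with low estimated regret are favored, but a point with higher regret can still be chosen when labeling it is expected to be informative about the model.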