Active Learning in Performance Analysis

Bibliographic Details
Published in: 2016 IEEE International Conference on Cluster Computing (CLUSTER), pp. 182-191
Main Authors: Duplyakin, Dmitry; Brown, Jed; Ricci, Robert
Format: Conference Proceeding
Language: English
Published: IEEE, 01.09.2016
ISSN: 2168-9253
DOI: 10.1109/CLUSTER.2016.63

More Information
Summary: Active Learning (AL) is a methodology from machine learning in which the learner interacts with the data source. In this paper, we investigate the application of AL techniques to a new domain: regression problems in performance analysis. For computational systems with many factors, each of which can take on many levels, fixed experiment designs can require many experiments and can explore the problem space inefficiently. We address these problems with a dynamic, adaptive experiment design, using AL in conjunction with Gaussian Process Regression (GPR). The performance analysis process is "seeded" with a small number of initial experiments; GPR then provides estimates of regression confidence across the full input space. AL is used to suggest follow-up experiments to run: in general, it suggests experiments in areas where the GPR model indicates low confidence, and through repeated experiments the process eventually achieves high confidence throughout the input space. We apply this approach to the problem of estimating the performance and energy usage of HPGMG-FE, and create good-quality predictive models for the quantities of interest, with low error and reduced cost, using only a modest number of experiments. Our analysis shows that the error reduction achieved by replacing the basic AL algorithm with a cost-aware algorithm can be significant, reaching up to 38% for the same computational cost of experiments.
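
The adaptive loop described in the summary (seed with a few experiments, fit a GPR model, run the next experiment where the model is least confident, repeat) can be sketched in a few lines. The sketch below is an illustration, not the authors' implementation: it uses scikit-learn's GaussianProcessRegressor, a synthetic one-dimensional objective in place of HPGMG-FE measurements, and an assumed kernel, seed size, and experiment budget.

```python
# Minimal sketch of uncertainty-driven active learning with GPR.
# The objective, kernel, seed size, and budget are illustrative
# assumptions; the paper's actual setup may differ.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def run_experiment(x):
    # Stand-in for a real measurement (e.g., runtime or energy of
    # an HPGMG-FE run at configuration x); synthetic and noisy here.
    return np.sin(3 * x[0]) + 0.1 * np.random.randn()

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 2.0, 200).reshape(-1, 1)  # full input space

# "Seed" the process with a handful of initial experiments.
X = rng.choice(candidates.ravel(), size=5, replace=False).reshape(-1, 1)
y = np.array([run_experiment(x) for x in X])

gpr = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                               normalize_y=True)

for _ in range(20):  # budget of follow-up experiments
    gpr.fit(X, y)
    _, std = gpr.predict(candidates, return_std=True)
    # Basic AL: query where predictive uncertainty is highest. A
    # cost-aware variant would instead maximize std(x) / cost(x).
    x_next = candidates[np.argmax(std)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next[0]))
```

After the loop, gpr.predict over the candidate grid serves as the predictive model for the quantity of interest; the cost-aware selection rule in the comment is one plausible reading of the cost-aware algorithm the summary mentions, not a quote of the paper's method.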