Active Learning in Performance Analysis

Bibliographic Details
Published in: 2016 IEEE International Conference on Cluster Computing (CLUSTER), pp. 182-191
Main Authors: Duplyakin, Dmitry; Brown, Jed; Ricci, Robert
Format: Conference Proceeding
Language: English
Published: IEEE, 01.09.2016
ISSN: 2168-9253
DOI: 10.1109/CLUSTER.2016.63

More Information
Summary: Active Learning (AL) is a methodology from machine learning in which the learner interacts with the data source. In this paper, we investigate the application of AL techniques to a new domain: regression problems in performance analysis. For computational systems with many factors, each of which can take on many levels, fixed experiment designs can require many experiments and can explore the problem space inefficiently. We address these problems with a dynamic, adaptive experiment design, using AL in conjunction with Gaussian Process Regression (GPR). The performance analysis process is "seeded" with a small number of initial experiments; GPR then provides estimates of regression confidence across the full input space. AL is used to suggest follow-up experiments to run: in general, it suggests experiments in areas where the GPR model indicates low confidence, and through repeated experiments the process eventually achieves high confidence throughout the input space. We apply this approach to the problem of estimating the performance and energy usage of HPGMG-FE, and create good-quality predictive models for the quantities of interest, with low error and reduced cost, using only a modest number of experiments. Our analysis shows that the error reduction achieved by replacing the basic AL algorithm with a cost-aware algorithm can be significant, reaching up to 38% for the same computational cost of experiments.
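
The adaptive loop described in the summary (seed with a few experiments, fit a GPR model, run the next experiment where the model is least confident, repeat) can be sketched in a few lines. The sketch below is an illustration, not the authors' implementation: it uses scikit-learn's GaussianProcessRegressor, a synthetic one-dimensional objective in place of HPGMG-FE measurements, and an assumed kernel, seed size, and experiment budget.

```python
# Minimal sketch of uncertainty-driven active learning with GPR.
# The objective, kernel, seed size, and budget are illustrative
# assumptions; the paper's actual setup may differ.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def run_experiment(x):
    # Stand-in for a real measurement (e.g., runtime or energy of
    # an HPGMG-FE run at configuration x); synthetic and noisy here.
    return np.sin(3 * x[0]) + 0.1 * np.random.randn()

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 2.0, 200).reshape(-1, 1)  # full input space

# "Seed" the process with a handful of initial experiments.
X = rng.choice(candidates.ravel(), size=5, replace=False).reshape(-1, 1)
y = np.array([run_experiment(x) for x in X])

gpr = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                               normalize_y=True)

for _ in range(20):  # budget of follow-up experiments
    gpr.fit(X, y)
    _, std = gpr.predict(candidates, return_std=True)
    # Basic AL: query where predictive uncertainty is highest. A
    # cost-aware variant would instead maximize std(x) / cost(x).
    x_next = candidates[np.argmax(std)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next[0]))
```

After the loop, gpr.predict over the candidate grid serves as the predictive model for the quantity of interest; the cost-aware selection rule in the comment is one plausible reading of the cost-aware algorithm the summary mentions, not a quote of the paper's method.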