Nonparametric Regression via Variance-Adjusted Gradient Boosting Gaussian Process Regression
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 33, No. 6, pp. 2669-2679
Main Authors:
Format: Journal Article
Language: English
Published: New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2021
Subjects:
Summary: Regression models have broad applications in data analytics. Gaussian process regression is a nonparametric regression model that learns nonlinear maps from input features to real-valued output using a kernel function that constructs the covariance matrix among all pairs of data. Gaussian process regression often performs well in various applications. However, the time complexity of Gaussian process regression is $O(n^3)$ for a training dataset of size $n$. The cubic time complexity hinders Gaussian process regression from scaling up to large datasets. Guided by the properties of Gaussian distributions, we developed a variance-adjusted gradient boosting algorithm for approximating Gaussian process regression (VAGR). VAGR sequentially approximates the full Gaussian process regression model using the residuals computed from variance-adjusted predictions based on randomly sampled training subsets. VAGR has a time complexity of $O(nm^3)$ for a training dataset of size $n$ and the chosen batch size $m$. The reduced time complexity allows us to apply VAGR to much larger datasets compared with the full Gaussian process regression. Our experiments suggest that VAGR has a prediction performance comparable to or better than models that include random forest, gradient boosting machines, support vector regressions, and stochastic variational inference for Gaussian process regression.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2019.2953728
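To make the procedure in the summary concrete, below is a minimal Python sketch of a boosting loop with Gaussian process base learners fitted on randomly sampled subsets of size $m$, which is where the $O(nm^3)$ cost comes from. This is not the authors' code: the function names (`fit_vagr_sketch`, `predict_vagr_sketch`), the scikit-learn kernel choice, and in particular the variance-based weight `noise / (noise + std**2)` are illustrative assumptions, since the abstract does not state the exact variance-adjustment formula.

```python
# Minimal sketch (under stated assumptions) of boosting GP base learners
# on random subsets of the training data, as described in the abstract.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel


def fit_vagr_sketch(X, y, n_stages=50, batch_size=200, noise=1e-2, seed=0):
    """Fit a sequence of subset GPs to variance-adjusted residuals."""
    rng = np.random.default_rng(seed)
    residual = np.asarray(y, dtype=float).copy()
    stages = []
    for _ in range(n_stages):
        # Randomly sample a training batch of size m (the chosen batch size).
        idx = rng.choice(len(X), size=min(batch_size, len(X)), replace=False)
        gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(noise))
        gp.fit(X[idx], residual[idx])
        # Posterior mean and standard deviation on the full training set.
        mean, std = gp.predict(X, return_std=True)
        # Assumed variance adjustment (not the published formula):
        # shrink predictions with large posterior variance before updating
        # the residuals that the next stage will fit.
        step = (noise / (noise + std ** 2)) * mean
        residual -= step
        stages.append(gp)
    return stages


def predict_vagr_sketch(stages, X, noise=1e-2):
    """Sum the variance-adjusted predictions of all boosting stages."""
    pred = np.zeros(len(X))
    for gp in stages:
        mean, std = gp.predict(X, return_std=True)
        pred += (noise / (noise + std ** 2)) * mean
    return pred
```

Each stage only inverts an $m \times m$ covariance matrix, so one pass over the data costs roughly $O(nm^3)$ rather than the $O(n^3)$ of a full Gaussian process fit; the specific shrinkage weight above is only one plausible way to realize the "variance-adjusted predictions" the abstract mentions.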