A Novel Family of Boosted Online Regression Algorithms with Strong Theoretical Bounds

We investigate boosted online regression and propose a novel family of regression algorithms with strong theoretical bounds. In addition, we implement several variants of the proposed generic algorithm. We specifically provide theoretical bounds for the performance of our proposed algorithms that ho...

Full description

Saved in:
Bibliographic Details
Main Authors Kari, Dariush, Khan, Farhan, Ciftci, Selami, Kozat, Suleyman Serdar
Format Journal Article
LanguageEnglish
Published 04.01.2016
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:We investigate boosted online regression and propose a novel family of regression algorithms with strong theoretical bounds. In addition, we implement several variants of the proposed generic algorithm. We specifically provide theoretical bounds for the performance of our proposed algorithms that hold in a strong mathematical sense. We achieve guaranteed performance improvement over the conventional online regression methods without any statistical assumptions on the desired data or feature vectors. We demonstrate an intrinsic relationship, in terms of boosting, between the adaptive mixture-of-experts and data reuse algorithms. Furthermore, we introduce a boosting algorithm based on random updates that is significantly faster than the conventional boosting methods and other variants of our proposed algorithms while achieving an enhanced performance gain. Hence, the random updates method is specifically applicable to the fast and high dimensional streaming data. Specifically, we investigate Newton Method-based and Stochastic Gradient Descent-based linear regression algorithms in a mixture-of-experts setting and provide several variants of these well-known adaptation methods. However, the proposed algorithms can be extended to other base learners, e.g., nonlinear, tree-based piecewise linear. Furthermore, we provide theoretical bounds for the computational complexity of our proposed algorithms. We demonstrate substantial performance gains in terms of mean square error over the base learners through an extensive set of benchmark real data sets and simulated examples.
DOI:10.48550/arxiv.1601.00549