On the doubt about margin explanation of boosting
| Field | Value |
|---|---|
| Published in | *Artificial Intelligence*, Vol. 203, pp. 1–18 |
| Main Authors | Wei Gao, Zhi-Hua Zhou |
| Format | Journal Article |
| Language | English |
| Published | Oxford: Elsevier B.V., 01.10.2013 |
| ISSN | 0004-3702; 1872-7921 |
| DOI | 10.1016/j.artint.2013.07.002 |
Summary: Margin theory provides one of the most popular explanations of the success of AdaBoost; its central point is the recognition that the margin is the key to characterizing the performance of AdaBoost. This theory has been very influential, e.g., it has been used to argue that AdaBoost usually does not overfit, since it tends to enlarge the margin even after the training error reaches zero. Previously, the minimum margin bound was established for AdaBoost; however, Breiman (1999) [9] pointed out that maximizing the minimum margin does not necessarily lead to better generalization. Later, Reyzin and Schapire (2006) [37] emphasized that the margin distribution, rather than the minimum margin, is crucial to the performance of AdaBoost. In this paper, we first present the kth margin bound and further study its relationship to previous work such as the minimum margin bound and the Emargin bound. Then, we improve the previous empirical Bernstein bounds (Audibert et al., 2009; Maurer and Pontil, 2009) [2,30], and based on these findings, we defend the margin-based explanation against Breiman's doubts by proving a new generalization error bound that considers exactly the same factors as Schapire et al. (1998) [39] but is sharper than Breiman's (1999) [9] minimum margin bound. By incorporating factors such as the average margin and the variance, we present a generalization error bound that is closely related to the whole margin distribution. We also provide margin distribution bounds for the generalization error of voting classifiers in finite VC-dimension space.
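For context, the quantity all of these bounds are stated in terms of is the normalized voting margin, and the classical margin-distribution bound of Schapire et al. (1998) has the general shape sketched below. This is an indicative sketch using standard notation (voting classifier f, base classifiers h_t with weights α_t, margin threshold θ, sample size m, base-class VC dimension d, confidence δ), not the exact statements from the paper:

```latex
% Normalized voting margin of an example (x, y), with y in {-1, +1} and a
% voting classifier f(x) = sum_t alpha_t h_t(x), alpha_t >= 0:
\[
  \operatorname{margin}(x, y)
  \;=\; \frac{y \sum_{t} \alpha_t h_t(x)}{\sum_{t} \alpha_t}
  \;\in\; [-1, 1].
\]
% Schapire et al. (1998)-style margin-distribution bound (up to constants):
% for any theta > 0, with probability at least 1 - delta over a sample S of
% size m, every voting classifier f satisfies
\[
  \Pr_{\mathcal{D}}\!\bigl[\, y f(x) \le 0 \,\bigr]
  \;\le\;
  \Pr_{S}\!\bigl[\, y f(x) \le \theta \,\bigr]
  + O\!\left(\sqrt{\frac{d \log^{2}(m/d)}{m\,\theta^{2}}
                   + \frac{\log(1/\delta)}{m}}\right),
\]
% where d is the VC dimension of the base classifier class.
```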
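To make the margin-distribution quantities in the summary concrete, here is a minimal sketch (not code from the paper) that fits an AdaBoost ensemble with scikit-learn and computes the empirical margins, including the minimum margin used by Breiman's bound and the average margin and variance that enter the margin-distribution bound. It assumes a discrete (SAMME-style) ensemble, so that `estimator_weights_` plays the role of the voting weights α_t:

```python
# Minimal sketch, assuming scikit-learn's AdaBoostClassifier with a discrete
# SAMME-style ensemble (estimator_weights_ then holds the voting weights
# alpha_t; under the older SAMME.R variant they are all 1, which still gives
# a valid unweighted voting margin).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=500, random_state=0)
y_pm = 2 * y - 1  # map labels {0, 1} -> {-1, +1}

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)

# Base-classifier votes h_t(x) in {-1, +1}, one row per boosting round t.
H = np.array([2 * est.predict(X) - 1 for est in clf.estimators_])
alpha = clf.estimator_weights_[: len(clf.estimators_)]

# Normalized voting margin: y * sum_t alpha_t h_t(x) / sum_t alpha_t.
margins = y_pm * (alpha @ H) / alpha.sum()

print("minimum margin:", margins.min())   # quantity behind the minimum margin bound
print("average margin:", margins.mean())  # enters the margin-distribution bound
print("margin variance:", margins.var())  # likewise
```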