Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation
Published in: Pattern Recognition, Vol. 48, no. 9, pp. 2839-2846
Main Author:
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.09.2015
Subjects:
Summary: Classification is an essential task for predicting the class values of new instances. Both k-fold and leave-one-out cross validation are widely used to evaluate the performance of classification algorithms. Much of the data mining literature describes how these two kinds of cross validation are carried out and which statistical methods can be used to analyze the resulting accuracies, but these descriptions are not always consistent, and analysts can therefore be confused about how to perform a cross validation procedure. In this paper, the independence assumptions in cross validation are introduced, and the circumstances under which these assumptions hold are addressed. The independence assumptions are then used to derive the sampling distributions of the point estimators for k-fold and leave-one-out cross validation. The cross validation procedure that yields these sampling distributions is discussed to provide new insights into evaluating the performance of classification algorithms.
Highlights:
• The definition of the independence assumptions is proposed and discussed.
• The sampling distributions for k-fold and leave-one-out cross validation are derived.
• New insights into evaluating the performance of classification algorithms are provided.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2015.03.009
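The summary above describes estimating a classifier's accuracy with k-fold and leave-one-out cross validation and analyzing the resulting fold accuracies statistically. As a minimal sketch of those two evaluation schemes only (not the paper's derivation; the iris data, the decision-tree classifier, k = 10, and the textbook t-interval are assumptions added here for illustration), the following Python snippet uses scikit-learn:

```python
# A rough illustration of k-fold and leave-one-out cross validation, not the
# procedure derived in the paper: the data set (iris), the classifier (a
# decision tree), and k = 10 are arbitrary assumptions for this sketch.
import numpy as np
from scipy import stats
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0)

# k-fold cross validation: each fold accuracy is one observed value of the
# point estimator; their mean is the usual performance estimate.
k = 10
kfold_scores = cross_val_score(clf, X, y,
                               cv=KFold(n_splits=k, shuffle=True, random_state=0))
print("10-fold accuracies :", np.round(kfold_scores, 3))
print("10-fold mean       : %.3f" % kfold_scores.mean())

# Textbook-style 95% t-interval that treats the k fold accuracies as
# independent observations (the independence assumption examined in the paper).
half_width = stats.t.ppf(0.975, df=k - 1) * kfold_scores.std(ddof=1) / np.sqrt(k)
print("95%% interval       : %.3f +/- %.3f" % (kfold_scores.mean(), half_width))

# Leave-one-out cross validation: each held-out instance gives a 0/1 score,
# so the mean is the proportion of correctly classified instances.
loo_scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print("Leave-one-out mean : %.3f" % loo_scores.mean())
```

The per-fold accuracies printed above are the quantities whose sampling distributions the paper derives under its independence assumptions; the t-interval shown is only the common textbook treatment of those accuracies as independent observations, not the procedure proposed in the article.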