Block-regularized 5\(\times\)2 Cross-validated McNemar's Test for Comparing Two Classification Algorithms
In the task of comparing two classification algorithms, the widely-used McNemar's test aims to infer the presence of a significant difference between the error rates of the two classification algorithms. However, the power of the conventional McNemar's test is usually unpromising because t...
Saved in:
Published in | arXiv.org |
---|---|
Main Authors | , |
Format | Paper |
Language | English |
Published |
Ithaca
Cornell University Library, arXiv.org
08.04.2023
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | In the task of comparing two classification algorithms, the widely-used McNemar's test aims to infer the presence of a significant difference between the error rates of the two classification algorithms. However, the power of the conventional McNemar's test is usually unpromising because the hold-out (HO) method in the test merely uses a single train-validation split that usually produces a highly varied estimation of the error rates. In contrast, a cross-validation (CV) method repeats the HO method in multiple times and produces a stable estimation. Therefore, a CV method has a great advantage to improve the power of McNemar's test. Among all types of CV methods, a block-regularized 5\(\times\)2 CV (BCV) has been shown in many previous studies to be superior to the other CV methods in the comparison task of algorithms because the 5\(\times\)2 BCV can produce a high-quality estimator of the error rate by regularizing the numbers of overlapping records between all training sets. In this study, we compress the 10 correlated contingency tables in the 5\(\times\)2 BCV to form an effective contingency table. Then, we define a 5\(\times\)2 BCV McNemar's test on the basis of the effective contingency table. We demonstrate the reasonable type I error and the promising power of the proposed 5\(\times\)2 BCV McNemar's test on multiple simulated and real-world data sets. |
---|---|
ISSN: | 2331-8422 |