Block-regularized 5\(\times\)2 Cross-validated McNemar's Test for Comparing Two Classification Algorithms


Bibliographic Details
Published in: arXiv.org
Main Authors: Wang, Ruibo; Li, Jihong
Format: Paper
Language: English
Published: Ithaca, Cornell University Library, arXiv.org, 08.04.2023
Summary: When comparing two classification algorithms, the widely used McNemar's test infers whether the error rates of the two algorithms differ significantly. However, the power of the conventional McNemar's test is usually low because the hold-out (HO) method in the test uses only a single train-validation split, which usually yields a highly variable estimate of the error rates. In contrast, a cross-validation (CV) method repeats the HO method multiple times and produces a more stable estimate. A CV method is therefore well suited to improving the power of McNemar's test. Among CV methods, the block-regularized 5\(\times\)2 CV (BCV) has been shown in many previous studies to be superior to other CV methods for comparing algorithms, because the 5\(\times\)2 BCV produces a high-quality estimator of the error rate by regularizing the numbers of overlapping records between all training sets. In this study, we compress the 10 correlated contingency tables in the 5\(\times\)2 BCV into a single effective contingency table. We then define a 5\(\times\)2 BCV McNemar's test on the basis of the effective contingency table. We demonstrate the reasonable type I error and the promising power of the proposed 5\(\times\)2 BCV McNemar's test on multiple simulated and real-world data sets.
ISSN: 2331-8422
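
For orientation, the conventional hold-out McNemar's test that the summary takes as its baseline can be sketched as follows. This is a minimal illustration of the textbook test on a single train-validation split, using the common continuity-corrected chi-square statistic. It is not the 5\(\times\)2 BCV McNemar's test proposed in the paper, whose effective contingency table is not specified in this record. The function name `mcnemar_test` and its arguments are hypothetical, chosen for illustration only.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Conventional McNemar's test on one hold-out validation set.

    Returns (statistic, p_value) using the continuity-corrected
    chi-square form with 1 degree of freedom.
    """
    # Discordant counts: records where exactly one algorithm is wrong.
    n01 = sum(1 for y, a, b in zip(y_true, pred_a, pred_b) if a != y and b == y)
    n10 = sum(1 for y, a, b in zip(y_true, pred_a, pred_b) if a == y and b != y)

    # No discordant records: no evidence of any difference.
    if n01 + n10 == 0:
        return 0.0, 1.0

    stat = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

With 10 records misclassified only by the first algorithm and none misclassified only by the second, the statistic is (10 - 1)^2 / 10 = 8.1. Because this test relies on one split, its outcome can vary considerably from split to split, which is the weakness the summary attributes to the HO method.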