A Unified Study on Sequentiality in Universal Classification With Empirically Observed Statistics

Bibliographic Details
Published in IEEE Transactions on Information Theory Vol. 71; no. 3; pp. 1546-1569
Main Authors Li, Ching-Fang, Wang, I-Hsiang
Format Journal Article
Language English
Published IEEE 01.03.2025
Summary: In the binary hypothesis testing problem, it is well known that sequentiality in taking samples eradicates the trade-off between two error exponents, yet implementing the optimal test requires the knowledge of the underlying distributions, say $P_{0}$ and $P_{1}$. In the scenario where the knowledge of distributions is replaced by empirically observed statistics from the respective distributions, the gain of sequentiality is less understood when subject to universality constraints over all possible $P_{0},P_{1}$. In this work, the gap is mended by a unified study on sequentiality in the universal binary classification problem, where the universality constraints are set on the expected stopping time as well as the type-I error exponent. The type-I error exponent is required to achieve a pre-set distribution-dependent constraint $\lambda (P_{0},P_{1})$ for all $P_{0},P_{1}$. Under the proposed framework, different sequential setups are investigated so that fair comparisons can be made with the fixed-length counterpart. By viewing these sequential classification problems as special cases of a general sequential composite hypothesis testing problem, the optimal type-II error exponents are characterized. Specifically, in the general sequential composite hypothesis testing problem subject to universality constraints, upper and lower bounds on the type-II error exponent are proved, and a sufficient condition for which the bounds coincide is given. The results for sequential classification problems are then obtained accordingly. With the characterization of the optimal error exponents, the benefit of sequentiality is shown both analytically and numerically by comparing the sequential and the fixed-length cases in representative examples of type-I exponent constraint $\lambda$.
ISSN: 0018-9448, 1557-9654
DOI: 10.1109/TIT.2024.3525012
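
The summary's opening remark concerns the classical setting in which $P_{0}$ and $P_{1}$ are known. As a rough illustration of what "sequentiality in taking samples" means there, the following is a minimal sketch of Wald's sequential probability ratio test (SPRT) on a finite alphabet. It is not the universal test studied in the article, which replaces knowledge of the distributions with training samples. The function names `draw` and `sprt`, the threshold parameters `log_a` and `log_b`, and the `max_steps` safety cap are all assumptions made for this example.

```python
import math
import random


def draw(pmf, rng):
    """Draw one symbol from a pmf given as {symbol: probability}."""
    symbols = list(pmf)
    weights = [pmf[s] for s in symbols]
    return rng.choices(symbols, weights=weights)[0]


def sprt(sample, p0, p1, log_a, log_b, max_steps=100_000):
    """Wald's SPRT between known pmfs p0 (hypothesis 0) and p1 (hypothesis 1).

    sample: zero-argument callable returning the next observation.
    Stops and declares 1 once the log-likelihood ratio reaches log_a,
    and declares 0 once it falls to -log_b.
    Returns (decision, number_of_samples_used).
    """
    llr = 0.0  # running sum of log(p1(x) / p0(x))
    for n in range(1, max_steps + 1):
        x = sample()
        llr += math.log(p1[x] / p0[x])
        if llr >= log_a:
            # Under p0 this event has probability at most exp(-log_a).
            return 1, n
        if llr <= -log_b:
            # Under p1 this event has probability at most exp(-log_b).
            return 0, n
    # Safety cap (an assumption of this sketch, not part of Wald's test).
    return (1 if llr > 0 else 0), max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    p0 = {0: 0.8, 1: 0.2}
    p1 = {0: 0.2, 1: 0.8}
    decision, n_used = sprt(lambda: draw(p0, rng), p0, p1,
                            math.log(1000), math.log(1000))
    print("decision:", decision, "samples used:", n_used)
```

The stopping time is random: the test keeps sampling only until the evidence is decisive, which is the mechanism behind the improved exponent trade-off in the known-distribution case. The two thresholds control the two error probabilities separately, each bounded by the exponential of the negated threshold.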