On semi-supervised estimation using exponential tilt mixture models

Bibliographic Details
Published in: Journal of Statistical Planning and Inference, Vol. 241, p. 106314
Main Authors: Tian, Ye; Zhang, Xinwei; Tan, Zhiqiang
Format: Journal Article
Language: English
Published: Netherlands, Elsevier B.V., 01.03.2026
ISSN: 0378-3758
DOI: 10.1016/j.jspi.2025.106314

Summary: Consider a semi-supervised setting with a labeled dataset of binary responses and predictors and an unlabeled dataset with only the predictors. Logistic regression is equivalent to an exponential tilt model in the labeled population. For semi-supervised estimation of regression coefficients in logistic regression, we develop further analysis and understanding of a statistical approach using exponential tilt mixture (ETM) models and maximum nonparametric likelihood estimation, while allowing that the class proportions may differ between the unlabeled and labeled data. We derive asymptotic properties of ETM-based estimation and demonstrate improved efficiency over supervised logistic regression in a random sampling setup and an outcome-stratified sampling setup previously used. Moreover, we reconcile such efficiency improvement with the existing semiparametric efficiency theory when the class proportions in the unlabeled and labeled data are restricted to be the same. We also provide a simulation study to numerically illustrate our theoretical findings.
• Study semi-supervised estimation related to logistic regression.
• Propose estimators based on exponential tilt mixture models and MLE.
• Delineate conditions for improved efficiency over supervised estimation.
• Connect the theoretical findings with the theory of semiparametric efficiency.
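The equivalence stated in the summary between logistic regression and an exponential tilt model can be sketched with Bayes' rule. The notation below (\beta_0^c, \beta, \pi, \rho, G_0, G_1) is assumed for illustration and is not taken from the article; it assumes a labeled population with class proportion 0 < \pi < 1.

```latex
% Illustrative notation only; not the article's own.
% Labeled population: logistic regression with intercept \beta_0^c and slope \beta.
\[
P(y = 1 \mid x) = \frac{\exp(\beta_0^c + \beta^\top x)}{1 + \exp(\beta_0^c + \beta^\top x)} .
\]
% Let \pi = P(y = 1), and let G_0, G_1 be the distributions of x given y = 0 and y = 1.
% Bayes' rule turns the conditional odds into a density ratio (the exponential tilt):
\[
\frac{dG_1}{dG_0}(x)
  = \frac{P(y = 1 \mid x)}{P(y = 0 \mid x)} \cdot \frac{1 - \pi}{\pi}
  = \exp(\beta_0 + \beta^\top x),
\qquad
\beta_0 = \beta_0^c - \log\frac{\pi}{1 - \pi} .
\]
% Unlabeled predictors: a mixture of G_0 and G_1 with class proportion \rho,
% which need not equal \pi.
\[
x \sim (1 - \rho)\, G_0 + \rho\, G_1,
\qquad
\frac{d\{(1 - \rho)\, G_0 + \rho\, G_1\}}{dG_0}(x) = 1 - \rho + \rho \exp(\beta_0 + \beta^\top x) .
\]
```

In this sketch the slope \beta is shared by the labeled and unlabeled models, which is why unlabeled predictors can carry information about the regression coefficients.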