Estimation and Inference for High-Dimensional Generalized Linear Models with Knowledge Transfer

Bibliographic Details
Published in: Journal of the American Statistical Association, Vol. 119, no. 546, pp. 1274-1285
Main Authors: Li, Sai; Zhang, Linjun; Cai, T. Tony; Li, Hongzhe
Format: Journal Article
Language: English
Published: United States, Taylor & Francis (Taylor & Francis Ltd), 02.04.2024
Summary: Transfer learning provides a powerful tool for incorporating data from related studies into a target study of interest. In epidemiology and medical studies, the classification of a target disease could borrow information across other related diseases and populations. In this work, we consider transfer learning for high-dimensional Generalized Linear Models (GLMs). A novel algorithm, TransHDGLM, that integrates data from the target study and the source studies is proposed. Minimax rate of convergence for estimation is established and the proposed estimator is shown to be rate-optimal. Statistical inference for the target regression coefficients is also studied. Asymptotic normality for a debiased estimator is established, which can be used for constructing coordinate-wise confidence intervals of the regression coefficients. Numerical studies show significant improvement in estimation and inference accuracy over GLMs that only use the target data. The proposed methods are applied to a real data study concerning the classification of colorectal cancer using gut microbiomes, and are shown to enhance the classification accuracy in comparison to methods that only use the target data. Supplementary materials for this article are available online.
ISSN: 0162-1459, 1537-274X
DOI: 10.1080/01621459.2023.2184373
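
Illustrative sketch: The summary describes integrating target and source data in a high-dimensional GLM but does not spell out the algorithm. The Python code below is a minimal sketch of one generic way such integration is often done, a two-step "pool, then correct" scheme for L1-penalized logistic regression. It is an assumption made for illustration and is not taken from this record; the article's TransHDGLM may differ in its details (for example in how sources are chosen, how penalties are set, and how the debiased estimator is built). All function names and parameters (`l1_logistic`, `two_step_transfer`, `lam_pool`, `lam_corr`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp().
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (coordinate-wise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def l1_logistic(X, y, lam, offset=None, n_iter=500):
    """L1-penalized logistic regression fitted by proximal gradient (ISTA).

    `offset` is a fixed term added to the linear predictor; it is not penalized.
    """
    n, p = X.shape
    offset = np.zeros(n) if offset is None else offset
    # The logistic-loss gradient is Lipschitz with constant ||X||_2^2 / (4n),
    # so 4n / ||X||_2^2 is a safe step size.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta + offset) - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


def two_step_transfer(X_tgt, y_tgt, sources, lam_pool, lam_corr):
    """Hypothetical two-step transfer estimator (illustration only).

    sources: list of (X_source, y_source) pairs assumed similar to the target.
    """
    # Step 1: rough estimate from the pooled target and source samples.
    X_pool = np.vstack([X_tgt] + [Xs for Xs, _ in sources])
    y_pool = np.concatenate([y_tgt] + [ys for _, ys in sources])
    w = l1_logistic(X_pool, y_pool, lam_pool)
    # Step 2: sparse correction fitted on target data only, with the
    # pooled fit entering as a fixed offset.
    delta = l1_logistic(X_tgt, y_tgt, lam_corr, offset=X_tgt @ w)
    return w + delta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 50
    beta_true = np.zeros(p)
    beta_true[:5] = 1.0

    def simulate(n, beta):
        X = rng.normal(size=(n, p))
        y = (rng.random(n) < sigmoid(X @ beta)).astype(float)
        return X, y

    X_tgt, y_tgt = simulate(100, beta_true)
    # One simulated source whose coefficients differ slightly from the target's.
    sources = [simulate(400, beta_true + 0.1 * (np.arange(p) == 7))]
    beta_hat = two_step_transfer(X_tgt, y_tgt, sources, lam_pool=0.05, lam_corr=0.1)
    print("estimation error:", np.linalg.norm(beta_hat - beta_true))
```

The correction step is what keeps the final estimate tied to the target study: if the sources are close to the target, the correction is small and sparse, and the estimate mostly inherits the precision of the larger pooled sample. The penalty levels here are arbitrary; in practice they would be tuned, for example by cross-validation.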