Mixed effect gradient boosting for high-dimensional longitudinal data
Published in | Scientific reports Vol. 15; no. 1; pp. 30927 - 24 |
---|---|
Format | Journal Article |
Language | English |
Published | London: Nature Publishing Group UK, 22.08.2025; Nature Publishing Group; Nature Portfolio |
ISSN | 2045-2322 |
DOI | 10.1038/s41598-025-16526-z |
Summary: | High-dimensional longitudinal data present significant analytical challenges due to intricate within-subject correlations and an overwhelming ratio of predictors to observations. To address these challenges, we introduce Mixed-Effect Gradient Boosting (MEGB), a novel R package that synergises gradient boosting with mixed-effects modelling to simultaneously account for population-level fixed effects and subject-specific random variability. MEGB provides a unified framework for analysing repeated measures data that accommodates complex covariance structures while harnessing gradient boosting’s inherent regularisation for robust feature selection and prediction. In comprehensive simulations spanning linear and nonlinear data-generating processes, MEGB achieved 35-76% lower mean squared error (MSE) compared to state-of-the-art alternatives like Mixed-Effect Random Forests (MERF) and REEMForest, while maintaining 55-70% true positive rates for variable selection in ultra-high-dimensional regimes (p = 2000). Demonstrating practical utility, we applied MEGB to maternal cell-free plasma RNA data (n = 12 subjects, p = 33,297 transcripts), where it identified 9 key placental transcripts driving fetal RNA dynamics across pregnancy trimesters. |
---|---|