Prior informed regularization of recursively updated latent-variables-based models with missing observations

Bibliographic Details
Published in Control engineering practice Vol. 116; p. 104933
Main Authors Sun, Xiaoyu; Rashid, Mudassir; Hobbs, Nicole; Askari, Mohammad Reza; Brandt, Rachel; Shahidehpour, Andrew; Cinar, Ali
Format Journal Article
Language English
Published Elsevier Ltd 01.11.2021
ISSN 0967-0661
1873-6939
DOI 10.1016/j.conengprac.2021.104933

Summary: Many data-driven modeling techniques identify locally valid, linear representations of time-varying or nonlinear systems, so the model parameters must be adaptively updated as the operating conditions of the system vary. Model identification, however, typically does not consider prior knowledge. In this work, we propose a new regularized partial least squares (rPLS) algorithm that incorporates prior knowledge in the model identification and can handle missing data in the independent covariates. This latent variable (LV) based modeling technique consists of three steps. First, an LV-based model is developed on the historical time series data. Second, the missing observations in the new incomplete data sample are estimated. Finally, the future values of the outputs are predicted as a linear combination of estimated scores and loadings. The model is recursively updated as new data are obtained from the system. The performance of the proposed rPLS and rPLS with exogenous inputs (rPLSX) algorithms is evaluated by modeling variations in glucose concentration (GC) of people with Type 1 diabetes (T1D) in response to meals and physical activities for prediction windows up to one hour, or 12 sampling instances, into the future. The proposed rPLS family of GC prediction models is evaluated with both in-silico and clinical experiment data and compared with recursive time series and kernel-based models. With simulated subjects in the multivariable T1D simulator, where physical activity effects are incorporated in GC variations, the root mean squared error (RMSE) is 2.52 and 5.81 mg/dL for 30- and 60-minute-ahead predictions, respectively, when information for all meals and physical activities is used, increasing to 2.70 and 6.54 mg/dL, respectively, when meals and activities occurred but the information is withheld from the modeling algorithms. For the clinical study, the RMSE is 10.45 and 14.48 mg/dL for prediction horizons of 30 and 60 minutes, respectively. The low RMSE values demonstrate the effectiveness of the proposed rPLS approach compared to the conventional recursive modeling algorithms.
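
The three steps named in the summary (fit an LV model on historical data, estimate the missing observations in a new sample, predict the output from scores and loadings) can be illustrated with a minimal Python sketch. This is not the authors' rPLS algorithm: the prior-informed regularization and the recursive update are omitted because the summary does not specify them. The sketch assumes a plain single-output NIPALS PLS fit and a least-squares projection of the observed entries onto the loadings to estimate the scores. The function names `fit_pls1` and `predict_with_missing` and the synthetic noise-free data are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Step 1 (assumed form): single-output PLS via NIPALS on mean-centered data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean              # residual matrices to deflate
    P = np.zeros((X.shape[1], n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings
    for a in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)                 # weight vector
        t = E @ w                              # score vector
        tt = t @ t
        P[:, a] = E.T @ t / tt
        q[a] = f @ t / tt
        E = E - np.outer(t, P[:, a])           # deflate X
        f = f - q[a] * t                       # deflate y
    return {"x_mean": x_mean, "y_mean": y_mean, "P": P, "q": q}

def predict_with_missing(model, x_new):
    """Steps 2 and 3 (assumed form): estimate scores from observed entries,
    fill the missing entries, and predict the output."""
    observed = ~np.isnan(x_new)
    P = model["P"]
    xc = x_new[observed] - model["x_mean"][observed]
    # Least-squares projection of the observed part onto the matching loading rows.
    t_hat, *_ = np.linalg.lstsq(P[observed], xc, rcond=None)
    # Reconstruct missing entries from the estimated scores and loadings.
    x_filled = np.where(observed, x_new, model["x_mean"] + P @ t_hat)
    # Output is a linear combination of estimated scores and y loadings.
    y_hat = model["y_mean"] + t_hat @ model["q"]
    return x_filled, y_hat

# Synthetic, noise-free data with two latent variables (illustrative only).
rng = np.random.default_rng(0)
T_true = rng.normal(size=(200, 2))
P_true = rng.normal(size=(5, 2))
c_true = np.array([1.0, -0.5])
X_hist = T_true @ P_true.T
y_hist = T_true @ c_true
model = fit_pls1(X_hist, y_hist, n_components=2)

# New sample with two of five covariates missing.
t_new = rng.normal(size=2)
x_true = P_true @ t_new
y_true = t_new @ c_true
x_new = x_true.copy()
x_new[[1, 3]] = np.nan
x_filled, y_hat = predict_with_missing(model, x_new)
print("filled sample:", x_filled)
print("prediction:", y_hat, "true value:", y_true)
```

On noise-free data that lie exactly in the latent subspace, the projection recovers the missing entries and the output; with noisy data the estimates are only approximate, and the number of observed covariates must be at least the number of latent variables for the projection to be well posed.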