Ridge regression combined with model complexity analysis for near infrared (NIR) spectroscopic model updating

Bibliographic Details
Published in Chemometrics and Intelligent Laboratory Systems Vol. 195; p. 103896
Main Authors Zhang, Feiyu; Zhang, Ruoqiu; Wang, Wenming; Yang, Wuye; Li, Long; Xiong, Yinran; Kang, Qidi; Du, Yiping
Format Journal Article
Language English
Published Elsevier B.V. 15.12.2019
Summary: Near infrared (NIR) calibration models can be used to predict samples that fall within the calibration domain. However, unmodeled sources of variance in new samples, such as instrumental drift and sample variations, result in unreliable predictions of product properties. In this case, model updating becomes very important. It involves recalculating the model coefficients after adding a few new samples to the original calibration samples. Because collecting new samples and their reference measurements is costly, normally only a few samples are used for model updating. It is therefore necessary to balance the relative importance of old and new samples by weighting the new samples. The model parameter of the regression method, however, influences the performance of an updated model much more than the weight of the new samples does. The bias/variance tradeoff (L curve) has been applied to the selection of the model parameter, but this approach contains a degree of subjectivity and does not always yield satisfactory models. To solve the model selection problem, a new method named model complexity analysis (MCA) was proposed in this work. According to MCA, the 2-norm of the regression coefficient vector of an updated model (‖β*‖₂) should be smaller than that of the original model (‖β‖₂). The ratio of ‖β*‖₂ to ‖β‖₂ was defined as ε, which should be in the range of 0–1. For a given value of ε, the model parameter can be uniquely determined as the value that minimizes |‖β*‖₂ − ε·‖β‖₂|. The influence of the number of new samples and their representativity on the selection of ε was studied. In this work, ridge regression (RR) was used for model updating because it is a regression method based on a 2-norm constraint. Results show that the proposed method based on MCA could select a reliable RR parameter. RR-MCA shows excellent performance on the three NIR datasets used in this work.
• Model complexity analysis (MCA) was proposed to select a reliable model parameter for model updating.
• According to MCA, reducing the 2-norm of model coefficients could expand the application scope of the calibration model.
• Ridge regression combined with MCA (RR-MCA) shows excellent performance for model updating.
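
The selection rule described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the record does not specify: the closed-form ridge solution without an intercept or mean-centering, weighting the new samples by scaling their rows (so a weight w contributes w² to the squared-error term), and a simple grid search over candidate ridge parameters. The names `ridge_coefficients`, `select_lambda_mca`, `weight`, and `lambdas` are hypothetical.

```python
import numpy as np


def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution (X'X + lam*I)^-1 X'y, without an intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def select_lambda_mca(X_old, y_old, X_new, y_new, beta_orig, eps,
                      weight=1.0, lambdas=None):
    """Pick the ridge parameter whose updated coefficient norm is closest
    to eps * ||beta_orig||_2, searching over a grid (an assumption)."""
    if lambdas is None:
        lambdas = np.logspace(-6, 6, 200)  # hypothetical default grid

    # Assumed weighting scheme: scale the rows of the new samples, then
    # append them to the original calibration set.
    X_aug = np.vstack([X_old, weight * X_new])
    y_aug = np.concatenate([y_old, weight * y_new])

    # Target norm for the updated model, eps * ||beta||_2 with 0 < eps < 1.
    target = eps * np.linalg.norm(beta_orig)

    # 2-norm of the updated coefficients for every candidate parameter.
    norms = np.array([
        np.linalg.norm(ridge_coefficients(X_aug, y_aug, lam))
        for lam in lambdas
    ])

    # min(abs(||beta*||_2 - eps * ||beta||_2)) over the grid.
    best = int(np.argmin(np.abs(norms - target)))
    lam_best = float(lambdas[best])
    return lam_best, ridge_coefficients(X_aug, y_aug, lam_best)
```

Because the norm of the ridge coefficients decreases as the ridge parameter grows, a finer grid simply brings the selected norm closer to the target; the choice of ε itself is the subject of the study and is not addressed by this sketch.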
ISSN: 0169-7439
1873-3239
DOI: 10.1016/j.chemolab.2019.103896