Comparison of selection optimal knot using Cross Validation and Generalized Cross Validation for nonparametric regression truncated spline longitudinal data

Bibliographic Details
Published in AIP Conference Proceedings Vol. 3132; no. 1
Main Authors Pramudita, Ditia Tahta; Budiantara, I. Nyoman; Ratnasari, Vita
Format Journal Article, Conference Proceeding
Language English
Published Melville, American Institute of Physics, 07.06.2024
Summary: Nonparametric regression is now widely studied because it is more flexible than parametric regression and does not require the same assumptions. One well-known nonparametric regression model is the truncated spline. Truncated spline regression can model data whose behavior changes across sub-intervals, has a statistical interpretation, and is easily accessible to researchers. Its knots make it very flexible in estimating the behavior of the data. In practice, however, the knot points must be selected, using methods such as Cross Validation (CV) and Generalized Cross Validation (GCV). Several researchers have developed CV and GCV methods to select optimal knot points in nonparametric regression for cross-section data, but the cross-section approach still has many weaknesses. A cross-section model can only be used for one subject (one region) and cannot model each subject separately, whereas real data require different modeling for each region. A cross-section model also cannot describe the behavior of the data over time (series). This study therefore compares the CV and GCV methods for choosing the optimal knots in nonparametric regression for longitudinal data, using the unemployment rate in Central Java with one and two knots and the coefficient of determination of the model as the criterion. The result of this research is the formulation of the CV, GCV and UBR methods for longitudinal data. Applied to the 2012-2021 unemployment rate data for the province of Central Java, the results show that the best model, judged by the minimum knot points and the coefficient of determination, is the CV model with two knot points, with a coefficient of determination of 96% and a minimum knot point of 2.301 × 10^-5.
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0211356
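
The knot selection idea described in the summary can be illustrated with a minimal sketch. It is not the paper's longitudinal formulation: it assumes a single subject, a linear truncated spline fitted by ordinary least squares, and the textbook leave-one-out CV and GCV formulas for a linear smoother. The function names (truncated_basis, criteria, select_knots), the candidate knot grid, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def truncated_basis(x, knots):
    """Design matrix [1, x, (x - k1)_+, ..., (x - kr)_+] for a linear truncated spline."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x]
    cols += [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)


def criteria(x, y, knots):
    """Return CV, GCV and R^2 for an OLS fit with the given knots (single-subject assumption)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = truncated_basis(x, knots)
    A = X @ np.linalg.pinv(X)          # hat matrix, so fitted values are A @ y
    resid = y - A @ y
    leverage = np.diag(A)
    # Leave-one-out CV via the leverage shortcut for linear smoothers.
    # Note: a knot with only one observation beyond it gives leverage 1 and an
    # undefined CV, so candidate knots should sit well inside the data range.
    cv = np.mean((resid / (1.0 - leverage)) ** 2)
    # GCV replaces the individual leverages by their average.
    gcv = np.mean(resid ** 2) / (np.trace(np.eye(n) - A) / n) ** 2
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return {"cv": cv, "gcv": gcv, "r2": r2}


def select_knots(x, y, candidates, n_knots, criterion="gcv"):
    """Grid search: pick the knot combination that minimizes the chosen criterion."""
    best = min(
        itertools.combinations(candidates, n_knots),
        key=lambda ks: criteria(x, y, ks)[criterion],
    )
    return best, criteria(x, y, best)


if __name__ == "__main__":
    # Synthetic data with one change of slope at x = 4 (not the unemployment data).
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 10.0, 60)
    y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - 4.0, 0.0) + rng.normal(0.0, 0.1, x.size)
    grid = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    for crit in ("cv", "gcv"):
        for r in (1, 2):
            knots, stats = select_knots(x, y, grid, r, criterion=crit)
            print(crit, r, knots, stats)
```

Running the script compares one-knot and two-knot fits under each criterion, mirroring the kind of comparison the summary reports. Extending it to longitudinal data would require the per-subject structure developed in the paper, which this sketch does not attempt.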