Monitoring Statistics and Tuning of Kernel Principal Component Analysis With Radial Basis Function Kernels

Bibliographic Details
Published in: IEEE Access, Vol. 8, pp. 198328-198342
Main Authors: Tan, Ruomu; Ottewill, James R.; Thornhill, Nina F.
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Kernel Principal Component Analysis (KPCA) using Radial Basis Function (RBF) kernels can capture data nonlinearity by projecting the original variable space to a high-dimensional kernel feature space and obtaining the kernel principal components. This article examines the tuning of the kernel width when using RBF kernels in KPCA, showing that inappropriate kernel widths result in RBF-KPCA being unable to capture nonlinearity present in data. The paper also considers the choice of monitoring statistics when RBF-KPCA is applied to anomaly detection. Linear PCA requires two monitoring statistics. The Hotelling's $T^{2}$ monitoring statistic detects when a sample exceeds the healthy operating range, while the Squared Prediction Error (SPE) monitoring statistic detects the case when the sample does not follow the model of the training data. The analysis in this article shows that SPE for RBF-KPCA can detect both cases. Moreover, unlike the case of linear PCA, the $T^{2}$ monitoring statistic for RBF-KPCA is non-monotonic with respect to the magnitude of the anomaly, making it not optimal as a monitoring statistic. The paper presents examples to illustrate these points. The paper also provides a detailed mathematical analysis which explains the observations from a theoretical perspective. Tuning strategies are proposed for setting the kernel width and the detection threshold of the monitoring statistic. The performance of optimally tuned RBF-KPCA for anomaly detection is demonstrated via numerical simulation and a benchmark dataset from an industrial-scale facility.
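As a rough illustration of the two monitoring statistics named in the summary, the following is a minimal NumPy sketch of a generic textbook RBF-KPCA formulation. It is not taken from the article. The function names (`rbf_kernel`, `fit_rbf_kpca`, `monitoring_statistics`), the kernel-width convention exp(-||x - y||^2 / (2 sigma^2)), the fixed values of `sigma` and `n_components`, and the synthetic data are all assumptions made for this example; the article's tuning strategies and detection thresholds are not reproduced.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """RBF kernel matrix; the 2*sigma**2 width convention is an assumption."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def fit_rbf_kpca(X, sigma, n_components):
    """Fit KPCA on training data X (hypothetical helper, not the article's code)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    Kc = J @ K @ J                             # kernel matrix centred in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]                       # assumed strictly positive
    alpha = eigvecs[:, order] / np.sqrt(lam)   # unit-norm directions in feature space
    return {
        "X": X,
        "sigma": sigma,
        "alpha": alpha,
        "score_var": lam / n,                  # variance of each training score
        "K_col_mean": K.mean(axis=0),
        "K_mean": K.mean(),
    }


def monitoring_statistics(model, X_new):
    """Return (T2, SPE) for each row of X_new."""
    k = rbf_kernel(X_new, model["X"], model["sigma"])
    k_row_mean = k.mean(axis=1, keepdims=True)
    # Centre the test kernel vectors consistently with the training data.
    kc = k - model["K_col_mean"][None, :] - k_row_mean + model["K_mean"]
    scores = kc @ model["alpha"]
    t2 = np.sum(scores**2 / model["score_var"], axis=1)
    # Squared norm of the centred feature vector; k(x, x) = 1 for an RBF kernel.
    norm2 = 1.0 - 2.0 * k_row_mean[:, 0] + model["K_mean"]
    spe = norm2 - np.sum(scores**2, axis=1)    # residual outside the retained subspace
    return t2, spe


# Synthetic demonstration data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
model = fit_rbf_kpca(X_train, sigma=1.0, n_components=3)
t2_train, spe_train = monitoring_statistics(model, X_train)

# Two samples very far from the training data, at different magnitudes.
X_far = np.array([[1e3, 0.0], [1e4, 0.0]])
t2_far, spe_far = monitoring_statistics(model, X_far)
```

In this sketch, the kernel values between a sample and the training data shrink to zero as the sample moves far away, so the centred kernel vector, and with it $T^{2}$, stops changing with the magnitude of the deviation. That is one way to see why $T^{2}$ alone may be a poor indicator of anomaly size for RBF-KPCA.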
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3034550