Personalized PCA: Decoupling Shared and Unique Features
Format: Journal Article
Language: English
Published: 16.07.2022
Summary: Journal of Machine Learning Research 2024, 25(41):1-82. In this paper, we tackle a significant challenge in PCA: heterogeneity. When data are collected from different sources with heterogeneous trends while still sharing some congruency, it is critical to extract shared knowledge while retaining the unique features of each source. To this end, we propose personalized PCA (PerPCA), which uses mutually orthogonal global and local principal components to encode both unique and shared features. We show that, under mild conditions, both unique and shared features can be identified and recovered by a constrained optimization problem, even if the covariance matrices are immensely different. We also design a fully federated algorithm, inspired by distributed Stiefel gradient descent, to solve the problem. The algorithm introduces a new group of operations called generalized retractions to handle the orthogonality constraints, and it requires only the global PCs to be shared across sources. We prove linear convergence of the algorithm under suitable assumptions. Comprehensive numerical experiments highlight PerPCA's superior performance in feature extraction and prediction from heterogeneous datasets. As a systematic approach to decoupling shared and unique features from heterogeneous datasets, PerPCA finds applications in several tasks, including video segmentation, topic extraction, and feature clustering.
DOI: 10.48550/arxiv.2207.08041
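
As an illustration of the decomposition the abstract describes, the sketch below is a minimal, centralized toy and not the paper's federated algorithm: it fits shared global PCs U and per-source local PCs V_i by plain projected gradient ascent with a QR-based retraction, standing in for the paper's generalized retractions and its communication scheme. All function and variable names (perpca_sketch, qr_retraction, covs, etc.) are hypothetical.

```python
# Toy illustration of the PerPCA idea from the abstract: shared global PCs U plus
# per-source local PCs V_i, with each V_i kept orthogonal to U. This is a sketch
# under assumed update rules, not the authors' reference implementation.
import numpy as np


def qr_retraction(M):
    """Map a full-column-rank matrix to one with orthonormal columns (QR-based)."""
    Q, R = np.linalg.qr(M)
    # Fix column signs so the retraction is deterministic.
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)


def project_out(A, B):
    """Remove from A its components lying in the column space of orthonormal B."""
    return A - B @ (B.T @ A)


def perpca_sketch(covs, k_global=2, k_local=1, lr=0.1, iters=300, seed=0):
    """covs: list of per-source covariance matrices S_i (all d x d)."""
    rng = np.random.default_rng(seed)
    d = covs[0].shape[0]
    U = qr_retraction(rng.standard_normal((d, k_global)))   # shared global PCs
    Vs = [qr_retraction(project_out(rng.standard_normal((d, k_local)), U))
          for _ in covs]                                     # per-source local PCs
    for _ in range(iters):
        # Ascent on sum_i tr(U^T S_i U); gradient w.r.t. U is 2 * sum_i S_i U.
        grad_U = sum(2.0 * S @ U for S in covs)
        U = qr_retraction(U + lr * grad_U / len(covs))
        new_Vs = []
        for S, V in zip(covs, Vs):
            grad_V = 2.0 * S @ V
            # Keep local PCs orthogonal to the shared ones before retracting.
            new_Vs.append(qr_retraction(project_out(V + lr * grad_V, U)))
        Vs = new_Vs
    return U, Vs


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d, n = 20, 500
    shared_dir = qr_retraction(rng.standard_normal((d, 2)))
    covs = []
    for _ in range(3):
        local_dir = qr_retraction(project_out(rng.standard_normal((d, 1)), shared_dir))
        X = (rng.standard_normal((n, 2)) @ (3.0 * shared_dir).T
             + rng.standard_normal((n, 1)) @ (2.0 * local_dir).T
             + 0.1 * rng.standard_normal((n, d)))
        covs.append(X.T @ X / n)
    U, Vs = perpca_sketch(covs)
    # Singular values near 1 indicate the shared subspace was recovered.
    print("alignment of U with true shared directions:",
          np.round(np.linalg.svd(U.T @ shared_dir, compute_uv=False), 3))
    print("U^T V_0 (near zero by construction):\n", np.round(U.T @ Vs[0], 3))
```

In this sketch the orthogonality between shared and local components is enforced by projecting each local update onto the orthogonal complement of U before retracting, which is one simple way to realize the mutual-orthogonality constraint mentioned in the abstract.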