Scalable and Flexible Multiview MAX-VAR Canonical Correlation Analysis

Bibliographic Details
Published in	IEEE Transactions on Signal Processing, Vol. 65, no. 16, pp. 4150-4165
Main Authors	Xiao Fu; Kejun Huang; Mingyi Hong; Nicholas D. Sidiropoulos; Anthony Man-Cho So
Format	Journal Article
Language	English
Published	New York: IEEE, 15.08.2017; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithm design and analysis; Algorithms; Canonical correlation analysis (CCA); Computer memory; Computer simulation; Constraints; Convergence; Correlation; Correlation analysis; Critical point; Design analysis; Dimensional analysis; feature selection; Least squares method; Materials handling; Mathematical models; Matrices (mathematics); MAX-VAR; multiview CCA; Optimization; Principal component analysis; Principal components analysis; Regularization; Scalability; Signal processing algorithms; Speech; word embedding

More Information
Summary:	Generalized canonical correlation analysis (GCCA) aims at finding latent low-dimensional common structure from multiple views (feature vectors in different domains) of the same entities. Unlike principal component analysis that handles a single view, (G)CCA is able to integrate information from different feature spaces. Here we focus on MAX-VAR GCCA, a popular formulation that has recently gained renewed interest in multilingual processing and speech modeling. The classic MAX-VAR GCCA problem can be solved optimally via eigen-decomposition of a matrix that compounds the (whitened) correlation matrices of the views; but this solution has serious scalability issues, and is not directly amenable to incorporating pertinent structural constraints such as nonnegativity and sparsity on the canonical components. We posit regularized MAX-VAR GCCA as a nonconvex optimization problem and propose an alternating optimization-based algorithm to handle it. Our algorithm alternates between inexact solutions of a regularized least squares subproblem and a manifold-constrained nonconvex subproblem, thereby achieving substantial memory and computational savings. An important benefit of our design is that it can easily handle structure-promoting regularization. We show that the algorithm globally converges to a critical point at a sublinear rate, and approaches a global optimal solution at a linear rate when no regularization is considered. Judiciously designed simulations and large-scale word embedding tasks are employed to showcase the effectiveness of the proposed algorithm.
ISSN:	1053-587X, 1941-0476
DOI:	10.1109/TSP.2017.2698365
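
The two solution strategies contrasted in the summary can be illustrated with a small NumPy sketch. This is not the authors' algorithm. It assumes the standard unregularized MAX-VAR objective, minimize sum_i ||X_i Q_i - G||_F^2 subject to G^T G = I, which the record itself does not spell out. The alternating variant uses exact least squares and an exact orthogonal Procrustes step where the summary describes inexact updates, and it omits regularization. All function names (`make_views`, `maxvar_gcca_eig`, `maxvar_gcca_alt`, `subspace_gap`) are hypothetical.

```python
import numpy as np


def make_views(n, dims, k, noise=0.1, seed=0):
    """Synthetic views sharing a k-dimensional latent factor (illustrative only)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, k))
    return [z @ rng.standard_normal((k, d)) + noise * rng.standard_normal((n, d))
            for d in dims]


def maxvar_gcca_eig(views, k):
    """Classic route: top-k eigenvectors of the sum of per-view projectors.

    Forms a dense n-by-n matrix, which is the scalability bottleneck
    the summary refers to.
    """
    n = views[0].shape[0]
    m = np.zeros((n, n))
    for x in views:
        # An orthonormal basis U of col(X) gives the projector
        # X (X^T X)^{-1} X^T = U U^T (assumes X has full column rank).
        u, _ = np.linalg.qr(x)
        m += u @ u.T
    _, vecs = np.linalg.eigh(m)      # eigenvalues in ascending order
    g = vecs[:, -k:]                 # common representation G
    qs = [np.linalg.lstsq(x, g, rcond=None)[0] for x in views]
    return g, qs


def maxvar_gcca_alt(views, k, n_iter=200, seed=0):
    """Alternating route: never forms the n-by-n matrix.

    Simplification (assumption): both subproblems are solved exactly here.
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    g, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(n_iter):
        # Q-step: one least-squares problem per view with G fixed.
        qs = [np.linalg.lstsq(x, g, rcond=None)[0] for x in views]
        # G-step: orthogonal Procrustes, G = U V^T from the SVD of sum_i X_i Q_i.
        r = sum(x @ q for x, q in zip(views, qs))
        u, _, vt = np.linalg.svd(r, full_matrices=False)
        g = u @ vt
    return g, qs


def subspace_gap(g1, g2):
    """Frobenius distance between the projectors onto span(g1) and span(g2)."""
    return np.linalg.norm(g1 @ g1.T - g2 @ g2.T)


if __name__ == "__main__":
    views = make_views(n=200, dims=(10, 12, 8), k=3, seed=1)
    g_eig, _ = maxvar_gcca_eig(views, 3)
    g_alt, _ = maxvar_gcca_alt(views, 3)
    print("subspace gap:", subspace_gap(g_eig, g_alt))
```

On well-separated synthetic data like this, the two routes should recover the same subspace up to a rotation of G, which is why the comparison uses projectors and not the matrices themselves. Structure-promoting penalties of the kind the summary mentions would enter through the Q-step, which this sketch leaves unregularized.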