Distributed Heavy-Ball: A Generalization and Acceleration of First-Order Methods With Gradient Tracking
| Published in | IEEE Transactions on Automatic Control, Vol. 65, No. 6, pp. 2627–2633 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | New York: IEEE, 01.06.2020 (The Institute of Electrical and Electronics Engineers, Inc.) |
Summary: We study distributed optimization to minimize a sum of smooth and strongly convex functions. Recent work on this problem uses gradient tracking to achieve linear convergence to the exact global minimizer. However, a connection among the different approaches has been unclear. In this paper, we first show that many of the existing first-order algorithms are related through a simple state transformation, at the heart of which lies a recently introduced algorithm known as $\mathcal{AB}$. We then present the distributed heavy-ball method, denoted as $\mathcal{AB}m$, which combines $\mathcal{AB}$ with a momentum term and uses nonidentical local step-sizes. By simultaneously implementing both row- and column-stochastic weights, $\mathcal{AB}m$ removes the conservatism in related work due to doubly stochastic weights or eigenvector estimation. $\mathcal{AB}m$ thus naturally leads to optimization and average consensus over both undirected and directed graphs. We show that $\mathcal{AB}m$ has a global $R$-linear rate when the largest step-size and the momentum parameter are positive and sufficiently small. Numerical experiments show that $\mathcal{AB}m$ achieves acceleration, particularly when the objective functions are ill-conditioned.
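The summary describes an iteration that mixes solution estimates with a row-stochastic matrix, tracks the average gradient with a column-stochastic matrix, and adds a heavy-ball momentum term with nonidentical local step-sizes. The following is a minimal NumPy sketch of such a gradient-tracking-with-momentum scheme on a toy problem; the network, the weight matrices, the quadratic local objectives $f_i(x) = \tfrac{1}{2}(x - c_i)^2$, and the step-size/momentum values are all illustrative assumptions, not the paper's reference implementation or tuned parameters.

```python
import numpy as np

# Toy strongly connected directed network of n = 3 agents (illustrative choice).
# A is row-stochastic (each row sums to 1): mixes the solution estimates x.
A = np.array([[0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0],
              [1/3, 1/3, 1/3]])
# B is column-stochastic (each column sums to 1): mixes the gradient trackers y.
B = np.array([[1/3, 0.0, 0.5],
              [1/3, 0.5, 0.0],
              [1/3, 0.5, 0.5]])

# Assumed local objectives f_i(x) = 0.5 * (x - c_i)^2; global minimizer = mean(c).
c = np.array([1.0, 4.0, -2.0])
grad = lambda x: x - c                # stacked local gradients, one per agent

alpha = np.array([0.08, 0.10, 0.09])  # nonidentical local step-sizes (assumed small)
beta = 0.2                            # heavy-ball momentum parameter (assumed small)

x = np.zeros(3)                       # local solution estimates
x_prev = x.copy()
y = grad(x)                           # trackers initialized at the local gradients

for _ in range(3000):
    # x-update: consensus on estimates, step along tracked gradient, momentum term
    x_new = A @ x - alpha * y + beta * (x - x_prev)
    # y-update: gradient tracking with column-stochastic mixing
    y = B @ y + grad(x_new) - grad(x)
    x_prev, x = x, x_new

print(x, c.mean())  # every agent's estimate should approach mean(c)
```

At a fixed point, row-stochasticity of `A` forces consensus, and the column-stochastic `y`-update preserves the sum of the trackers, so the common value must be the global minimizer; this is the mechanism that lets the scheme avoid doubly stochastic weights on directed graphs.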
ISSN: 0018-9286, 1558-2523
DOI: 10.1109/TAC.2019.2942513