Accelerated Distributed Nesterov Gradient Descent

Bibliographic Details
Published in	IEEE Transactions on Automatic Control, Vol. 65, no. 6, pp. 2566-2581
Main Authors	Qu, Guannan; Li, Na
Format	Journal Article
Language	English
Published	New York: IEEE, 01.06.2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Acceleration; Convergence; Convex functions; Distributed algorithms; distributed optimization; Gradient methods; Linear programming; multiagent systems; Optimization; optimization methods; Radio frequency
Summary:	This paper considers the distributed optimization problem over a network, where the objective is to optimize a global function formed by a sum of local functions, using only local computation and communication. We develop an accelerated distributed Nesterov gradient descent method. When the objective function is convex and $L$-smooth, we show that it achieves an $O(\frac{1}{t^{1.4-\epsilon}})$ convergence rate for all $\epsilon \in (0, 1.4)$. We also show that the convergence rate can be improved to $O(\frac{1}{t^2})$ if the objective function is a composition of a linear map and a strongly convex and smooth function. When the objective function is $\mu$-strongly convex and $L$-smooth, we show that it achieves a linear convergence rate of $O([1 - C(\frac{\mu}{L})^{5/7}]^t)$, where $\frac{L}{\mu}$ is the condition number of the objective, and $C > 0$ is some constant that does not depend on $\frac{L}{\mu}$.
ISSN:	0018-9286, 1558-2523
DOI:	10.1109/TAC.2019.2937496
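
Note:	To get a feel for the rates quoted in the summary, the short Python sketch below turns them into rough iteration counts. It is only an illustration of the arithmetic and does not implement the paper's method. The function names, the choice C = 1, the condition number 100, the tolerances, and the exponent-1 baseline are all assumptions made here for comparison; the O(·) bounds also hide constant factors, so the counts are indicative at best.

```python
import math


def contraction_factor(kappa: float, exponent: float, c: float = 1.0) -> float:
    """Per-iteration factor 1 - c * (mu/L)**exponent, where kappa = L/mu.

    c stands in for the unspecified constant C > 0 (assumed to be 1 here).
    """
    if kappa < 1.0:
        raise ValueError("condition number L/mu must be at least 1")
    factor = 1.0 - c * (1.0 / kappa) ** exponent
    if not 0.0 < factor < 1.0:
        raise ValueError("choose c so that the factor lies in (0, 1)")
    return factor


def iterations_linear(factor: float, tol: float) -> int:
    """Smallest t with factor**t <= tol (hidden constants ignored)."""
    return math.ceil(math.log(tol) / math.log(factor))


def iterations_sublinear(power: float, tol: float) -> int:
    """Smallest t with 1 / t**power <= tol (hidden constants ignored)."""
    return math.ceil(tol ** (-1.0 / power))


if __name__ == "__main__":
    kappa, tol = 100.0, 1e-6  # assumed values, for illustration only
    cases = [
        ("exponent 5/7 (rate from the summary)", 5 / 7),
        ("exponent 1 (hypothetical baseline)", 1.0),
    ]
    for label, exponent in cases:
        factor = contraction_factor(kappa, exponent)
        print(f"{label}: about {iterations_linear(factor, tol)} iterations")

    # Sublinear rates: O(1/t^(1.4 - eps)) with eps = 0.1 versus O(1/t^2).
    for power in (1.3, 2.0):
        t = iterations_sublinear(power, 1e-3)
        print(f"1/t^{power}: about {t} iterations")
```

Because $\mu/L \le 1$, a smaller exponent on $\mu/L$ gives a smaller contraction factor, which is why the 5/7 exponent needs fewer iterations than the exponent-1 baseline in this sketch.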