Decentralized Frank-Wolfe Algorithm for Convex and Nonconvex Problems

Decentralized optimization algorithms have received much attention due to the recent advances in network information processing. However, conventional decentralized algorithms based on projected gradient descent are incapable of handling high-dimensional constrained problems, as the projection step...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on automatic control Vol. 62; no. 11; pp. 5522 - 5537
Main Authors Hoi-To Wai, Lafond, Jean, Scaglione, Anna, Moulines, Eric
Format Journal Article
LanguageEnglish
Published IEEE 01.11.2017
Institute of Electrical and Electronics Engineers
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Decentralized optimization algorithms have received much attention due to the recent advances in network information processing. However, conventional decentralized algorithms based on projected gradient descent are incapable of handling high-dimensional constrained problems, as the projection step becomes computationally prohibitive. To address this problem, this paper adopts a projection-free optimization approach, a.k.a. the Frank-Wolfe (FW) or conditional gradient algorithm. We first develop a decentralized FW (DeFW) algorithm from the classical FW algorithm. The convergence of the proposed algorithm is studied by viewing the decentralized algorithm as an inexact FW algorithm. Using a diminishing step size rule and letting t be the iteration number, we show that the DeFW algorithm's convergence rate is O(1/t) for convex objectives; is O(1/t 2 ) for strongly convex objectives with the optimal solution in the interior of the constraint set; and is O(1/√t) toward a stationary point for smooth but nonconvex objectives. We then show that a consensus-based DeFW algorithm meets the above guarantees with two communication rounds per iteration. We demonstrate the advantages of the proposed DeFW algorithm on low-complexity robust matrix completion and communication efficient sparse learning. Numerical results on synthetic and real data are presented to support our findings.
ISSN:0018-9286
1558-2523
DOI:10.1109/TAC.2017.2685559