Fast Convergence of Inertial Dynamics with Hessian-Driven Damping Under Geometry Assumptions
| Published in | Applied Mathematics & Optimization, Vol. 88, No. 3, Article 81 |
| --- | --- |
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | New York: Springer US, 01.12.2023 (Springer Nature B.V.) |
Summary: First-order optimization algorithms can be considered as discretizations of ordinary differential equations (ODEs) (Su et al. in Adv Neural Inf Process Syst 27, 2014). In this perspective, studying the properties of the corresponding trajectories may lead to convergence results which can be transferred to the numerical scheme. In this paper we analyse the following ODE, introduced by Attouch et al. (J Differ Equ 261(10):5734–5783, 2016):

$$\forall t \geqslant t_0, \quad \ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \beta H_F(x(t))\,\dot{x}(t) + \nabla F(x(t)) = 0,$$

where $\alpha > 0$, $\beta > 0$ and $H_F$ denotes the Hessian of $F$. As shown in Attouch et al. (Math Program 1–43, 2020) and Attouch et al. (Optimization 72:1–40, 2021), numerical schemes that do not require $F$ to be twice differentiable can be derived from this ODE. We provide strong convergence results on the error $F(x(t)) - F^*$ and integrability properties on $\|\nabla F(x(t))\|$ under geometry assumptions on $F$ such as quadratic growth around the set of minimizers. In particular, we show that the decay rate of the error for a strongly convex function is $\mathcal{O}(t^{-\alpha-\varepsilon})$ for any $\varepsilon > 0$. These results are briefly illustrated at the end of the paper.
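As a quick illustration of the dynamics analysed in the paper, the sketch below integrates the ODE with explicit Euler for a strongly convex quadratic and prints the error $F(x(t)) - F^*$ along the trajectory. This is not the authors' numerical scheme; the objective and the values of `alpha`, `beta`, `t0` and the step size `h` are illustrative assumptions.

```python
# Minimal sketch (not the authors' scheme): explicit Euler integration of
#   x''(t) + (alpha/t) x'(t) + beta * H_F(x(t)) x'(t) + grad F(x(t)) = 0
# for the quadratic F(x) = 0.5 * x^T A x with A symmetric positive definite,
# so F is strongly convex, F* = 0 and the Hessian H_F is the constant A.
import numpy as np

alpha, beta = 3.1, 1.0             # damping parameters (alpha > 0, beta > 0)
A = np.diag([1.0, 10.0])           # SPD matrix: F is strongly convex
F = lambda x: 0.5 * x @ A @ x      # objective, minimized at x* = 0 with F* = 0
grad_F = lambda x: A @ x           # gradient of F
H_F = lambda x: A                  # Hessian of F (constant for a quadratic)

t0, h = 1.0, 1e-3                  # initial time and Euler step size
t = t0
x = np.array([1.0, 1.0])           # x(t0)
v = np.zeros_like(x)               # x'(t0) = 0

# First-order reformulation: x' = v,  v' = -(alpha/t) v - beta H_F(x) v - grad F(x).
for i in range(100_000):
    a = -(alpha / t) * v - beta * (H_F(x) @ v) - grad_F(x)
    x, v, t = x + h * v, v + h * a, t + h
    if (i + 1) % 20_000 == 0:
        print(f"t = {t:6.1f}   F(x(t)) - F* = {F(x):.3e}   t**(-alpha) = {t**(-alpha):.3e}")
```

Printing $t^{-\alpha}$ alongside the error gives a reference rate: for this strongly convex quadratic the error decays markedly faster, which is consistent with the $\mathcal{O}(t^{-\alpha-\varepsilon})$ rate stated in the abstract. A log-log plot of the two quantities makes the rate easier to read off.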
ISSN: 0095-4616 (print); 1432-0606 (electronic)
DOI: 10.1007/s00245-023-10058-6