Approximation and gradient descent training with neural networks

Bibliographic Details
Published in: Sampling Theory, Signal Processing, and Data Analysis, Vol. 23, No. 2
Main Author: Welper, Gerrit
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing, 01.12.2025
Summary: It is well understood that neural networks with carefully hand-picked weights provide powerful function approximation and that they can be successfully trained in over-parametrized regimes. Since over-parametrization ensures zero training error, these two theories are not immediately compatible. Recent work uses the smoothness that is required for approximation results to extend a neural tangent kernel (NTK) optimization argument to an under-parametrized regime and show direct approximation bounds for networks trained by gradient flow. Since gradient flow is only an idealization of a practical method, this paper establishes analogous results for networks trained by gradient descent.
ISSN: 2730-5716, 2730-5724
DOI: 10.1007/s43670-025-00116-1
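
The summary above contrasts gradient flow (a continuous-time idealization) with gradient descent (the discrete method used in practice). The following sketch is not taken from the paper; it is a generic illustration, with assumed network size, target function, and step size, of the basic relationship the summary relies on: gradient descent with step size h is the explicit Euler discretization of the gradient-flow ODE dθ/dt = −∇L(θ), applied here to a shallow ReLU network fit by least squares.

# Minimal sketch (illustrative only, not the paper's construction): gradient
# descent as the Euler discretization of gradient flow for a shallow network.
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: n samples of a smooth target, m hidden ReLU neurons.
n, m = 64, 32
x = np.linspace(-1.0, 1.0, n)
y = np.sin(np.pi * x)                 # smooth target, as approximation theory assumes

# Network f(x) = sum_k a_k * relu(w_k * x + b_k), NTK-style 1/sqrt(m) output scaling.
w = rng.normal(size=m)
b = rng.normal(size=m)
a = rng.normal(size=m) / np.sqrt(m)

def forward(x):
    pre = np.outer(x, w) + b          # (n, m) pre-activations
    return np.maximum(pre, 0.0) @ a, pre

def loss_and_grads(x, y):
    pred, pre = forward(x)
    r = pred - y                      # residual
    act = np.maximum(pre, 0.0)
    dact = (pre > 0).astype(float)
    ga = act.T @ r / n                                            # grad w.r.t. outer weights a
    gw = ((dact * a) * r[:, None] * x[:, None]).sum(axis=0) / n   # grad w.r.t. inner weights w
    gb = ((dact * a) * r[:, None]).sum(axis=0) / n                # grad w.r.t. biases b
    return 0.5 * np.mean(r ** 2), ga, gw, gb

# Gradient descent: repeated explicit Euler steps of the gradient flow with step size h.
h = 0.1
for step in range(2000):
    L, ga, gw, gb = loss_and_grads(x, y)
    a -= h * ga
    w -= h * gw
    b -= h * gb

print("final training loss:", loss_and_grads(x, y)[0])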