Approximation and gradient descent training with neural networks
Published in | Sampling theory, signal processing, and data analysis Vol. 23; no. 2 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | Cham: Springer International Publishing, 01.12.2025 |
Subjects | |
Summary | It is well understood that neural networks with carefully hand-picked weights provide powerful function approximation and that they can be successfully trained in over-parametrized regimes. Since over-parametrization ensures zero training error, these two theories are not immediately compatible. Recent work uses the smoothness that is required for approximation results to extend a neural tangent kernel (NTK) optimization argument to an under-parametrized regime and show direct approximation bounds for networks trained by gradient flow. Since gradient flow is only an idealization of a practical method, this paper establishes analogous results for networks trained by gradient descent. |
ISSN | 2730-5716; 2730-5724 |
DOI | 10.1007/s43670-025-00116-1 |
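
The summary contrasts gradient flow, a continuous-time idealization, with the gradient descent iterations used in practice. As a hedged illustration only (not the paper's construction or its bounds), the sketch below trains the outer weights of a shallow ReLU network by plain gradient descent and compares the result with a finely discretized emulation of gradient flow over the same amount of "time"; the network width, target function, step size, and random-feature setup are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's setting):
# gradient descent is the forward-Euler discretization of gradient flow,
#   d/dt theta(t) = -grad L(theta(t)),   theta_{k+1} = theta_k - h * grad L(theta_k).
import numpy as np

rng = np.random.default_rng(0)

# Training data: a smooth 1-D target sampled at n points.
n, width = 32, 200
x = np.linspace(-1.0, 1.0, n)
y = np.sin(np.pi * x)

# Shallow network f(x) = sum_j a_j * relu(w_j * x + b_j) / sqrt(width);
# only the outer weights a are trained, inner weights stay at random init.
w = rng.normal(size=width)
b = rng.normal(size=width)
phi = np.maximum(w[None, :] * x[:, None] + b[None, :], 0.0) / np.sqrt(width)  # n x width

def loss_grad(a):
    """Squared loss L(a) = (1/2n) ||phi a - y||^2 and its gradient in a."""
    r = phi @ a - y
    return 0.5 * np.mean(r**2), phi.T @ r / n

a_gd = np.zeros(width)   # gradient descent iterate
a_gf = np.zeros(width)   # gradient "flow", emulated with much smaller Euler steps
h, steps = 0.5, 2000     # assumed step size and iteration budget
substeps = 50            # each GD step is matched by `substeps` tiny flow steps

for _ in range(steps):
    _, g = loss_grad(a_gd)
    a_gd -= h * g                      # one discrete gradient descent step of size h
    for _ in range(substeps):          # the same time interval h along gradient flow
        _, g = loss_grad(a_gf)
        a_gf -= (h / substeps) * g

print("GD   loss:", loss_grad(a_gd)[0])
print("flow loss:", loss_grad(a_gf)[0])
print("parameter gap ||a_gd - a_gf||:", np.linalg.norm(a_gd - a_gf))
```

For step sizes small relative to the curvature of the loss, the two trajectories remain close, which is the informal sense in which results proved for gradient flow are expected to carry over to gradient descent; the paper makes this rigorous in its own setting.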