Random Sketching for Neural Networks With ReLU

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 32, No. 2, pp. 748-762
Main Authors: Wang, Di; Zeng, Jinshan; Lin, Shao-Bo
Format: Journal Article
Language: English
Published: United States, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2021
Summary: Training neural networks has recently become a hot topic in machine learning due to its great success in many applications. Since training a neural network usually involves a highly nonconvex optimization problem, it is difficult to design optimization algorithms with strong convergence guarantees that yield a neural network estimator of high quality. In this article, we borrow the well-known random sketching strategy from kernel methods to transform the training of shallow rectified linear unit (ReLU) nets into a linear least-squares problem. Using the localized approximation property of shallow ReLU nets and a recently developed dimensionality-leveraging scheme, we equip shallow ReLU nets with a specific random sketching scheme. The efficiency of the suggested random sketching strategy is guaranteed by theoretical analysis and verified via a series of numerical experiments. Theoretically, we show that the proposed random sketching is almost optimal in terms of both approximation capability and learning performance, which implies that random sketching does not degrade the performance of shallow ReLU nets. Numerically, we show that random sketching can significantly reduce the computational burden of numerous backpropagation (BP) algorithms while maintaining their learning performance.
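
To make the core idea in the summary concrete, the following is a minimal, hypothetical Python sketch of the kind of construction described: the inner weights and biases of a shallow ReLU net are drawn at random and kept fixed, so only the output-layer coefficients remain to be fit, which reduces training to a linear least-squares problem. The sampling distributions, network width, and synthetic data below are illustrative assumptions, not the authors' exact sketching scheme.

# Illustrative sketch (assumptions, not the paper's exact method): fix randomly
# drawn inner weights and biases of a shallow ReLU net, then fit only the
# output-layer coefficients by linear least squares.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (assumed for the example).
n, d, m = 200, 1, 50          # samples, input dimension, hidden ReLU units
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.standard_normal(n)

# Randomly "sketched" inner parameters: sampled once and kept fixed.
W = rng.standard_normal((d, m))
b = rng.uniform(-1.0, 1.0, size=m)

# Hidden-layer feature matrix with ReLU activations.
Phi = np.maximum(X @ W + b, 0.0)

# Training reduces to a linear least-squares problem in the output weights c.
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Prediction with the fitted shallow ReLU net.
y_hat = Phi @ c
print("training RMSE:", np.sqrt(np.mean((y_hat - y) ** 2)))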
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2020.2979228