Random Sketching for Neural Networks With ReLU

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 32, No. 2, pp. 748-762
Main Authors: Wang, Di; Zeng, Jinshan; Lin, Shao-Bo
Format: Journal Article
Language: English
Published: United States, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2021
Summary: Training neural networks has recently become a hot topic in machine learning due to its great success in many applications. Since training a neural network usually involves a highly nonconvex optimization problem, it is difficult to design optimization algorithms with strong convergence guarantees that yield a neural network estimator of high quality. In this article, we borrow the well-known random sketching strategy from kernel methods to transform the training of shallow rectified linear unit (ReLU) nets into a linear least-squares problem. Using the localized approximation property of shallow ReLU nets and a recently developed dimensionality-leveraging scheme, we equip shallow ReLU nets with a specific random sketching scheme. The efficiency of the suggested random sketching strategy is guaranteed by theoretical analysis and verified via a series of numerical experiments. Theoretically, we show that the proposed random sketching is almost optimal in terms of both approximation capability and learning performance, which implies that random sketching does not degrade the performance of shallow ReLU nets. Numerically, we show that random sketching can significantly reduce the computational burden of numerous backpropagation (BP) algorithms while maintaining their learning performance.
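
To make the core idea in the summary concrete, the following is a minimal, hypothetical Python sketch of the kind of construction described: the inner weights and biases of a shallow ReLU net are drawn at random and kept fixed, so only the output-layer coefficients remain to be fit, which reduces training to a linear least-squares problem. The sampling distributions, network width, and synthetic data below are illustrative assumptions, not the authors' exact sketching scheme.

# Illustrative sketch (assumptions, not the paper's exact method): fix randomly
# drawn inner weights and biases of a shallow ReLU net, then fit only the
# output-layer coefficients by linear least squares.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (assumed for the example).
n, d, m = 200, 1, 50          # samples, input dimension, hidden ReLU units
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(3.0 * X[:, 0]) + 0.05 * rng.standard_normal(n)

# Randomly "sketched" inner parameters: sampled once and kept fixed.
W = rng.standard_normal((d, m))
b = rng.uniform(-1.0, 1.0, size=m)

# Hidden-layer feature matrix with ReLU activations.
Phi = np.maximum(X @ W + b, 0.0)

# Training reduces to a linear least-squares problem in the output weights c.
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Prediction with the fitted shallow ReLU net.
y_hat = Phi @ c
print("training RMSE:", np.sqrt(np.mean((y_hat - y) ** 2)))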
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2020.2979228