On a Fitting of a Heaviside Function by Deep ReLU Neural Networks
Published in | Neural Information Processing, Vol. 11301, pp. 59-69
Format | Book Chapter
Language | English
Published | Switzerland: Springer International Publishing AG, 2018
Series | Lecture Notes in Computer Science
ISBN | 9783030041663; 3030041662
ISSN | 0302-9743; 1611-3349
DOI | 10.1007/978-3-030-04167-0_6
Summary | A recent research interest in deep neural networks is to understand why deep networks are preferred to shallow ones. In this article, we considered an advantage of a deep structure in realizing a Heaviside function in training. This is significant not only for simple classification problems but also as a basis for constructing general non-smooth functions. A Heaviside function can be well approximated by a difference of ReLUs if extremely large weight values can be set; however, such values are not easy to attain in training. We showed that a Heaviside function can be well represented without large weight values if we employ a deep structure. We also showed that, when a network is trained to realize a Heaviside function, the update terms of the input-side weights are necessarily large, so an apparent acceleration of training is obtained even when a small learning rate is set. As a result, by employing a deep structure, a good fitting of a Heaviside function can be obtained within a reasonable training time under a moderately small learning rate. Our results suggest that a deep structure is effective in practical training that requires a discontinuous output.
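The contrast drawn in the summary (a single difference of ReLUs needs extremely large weights to mimic a step, whereas a deep composition does not) can be illustrated with a short NumPy sketch. The names `ramp`, `shallow_step`, and `deep_step` and the slope value `a` below are illustrative assumptions, not the construction used in the chapter; the sketch only shows that composing moderate-slope ramps multiplies the effective slope up to `a**depth` without any individual weight exceeding `a`.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def ramp(x, a):
    # Difference of two ReLUs: rises linearly from 0 to 1 on [0, 1/a],
    # so a single layer needs a huge slope a to resemble a Heaviside step.
    return relu(a * x) - relu(a * x - 1.0)

def shallow_step(x, a=1e6):
    # Shallow approximation: one ramp with an extremely large weight a.
    return ramp(x, a)

def deep_step(x, a=10.0, depth=6):
    # Deep alternative (illustrative): compose moderate-slope ramps.
    # Each layer multiplies the effective slope, reaching a**depth
    # while every individual weight stays at the moderate value a.
    y = x
    for _ in range(depth):
        y = ramp(y, a)
    return y

xs = np.array([-0.05, 0.0, 1e-6, 0.05])
print(shallow_step(xs))  # ~[0, 0, 1, 1]: step-like, but uses a weight of 1e6
print(deep_step(xs))     # ~[0, 0, 1, 1]: same sharpness with weights of 10
```

Under these assumptions, both outputs switch from 0 to 1 within about 1e-6 of the origin, but only the shallow version relies on an individual weight of order 1e6.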