Late Breaking Results: Weight Decay is ALL You Need for Neural Network Sparsification
| Published in | 2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1-2 |
|---|---|
| Main Authors | |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 09.07.2023 |
| DOI | 10.1109/DAC56929.2023.10247950 |
Summary: The heuristic iterative pruning strategy has been widely used for neural network sparsification. However, it is challenging to identify the right connections to remove at each pruning iteration with only a one-shot evaluation of weight magnitude, especially at the early pruning stage. The erroneously removed connections, unfortunately, can hardly be recovered. In this work, we propose a weight decay strategy as a substitute for pruning, which lets the "insignificant" weights decay moderately instead of being directly clamped to zero. At the end of training, the vast majority of redundant weights naturally become close to zero, making it easy to identify which connections can be removed safely. Experimental results show that the proposed weight decay method achieves an ultra-high sparsity of 99%. Compared to the current pruning strategy, the model size is further reduced by 34%, improving the compression rate from 69× to 106× at the same accuracy.
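The abstract describes letting low-magnitude ("insignificant") weights decay gradually during training rather than clamping them to zero in a one-shot pruning step. The record does not include code, so the PyTorch sketch below is only an illustration of that general idea under assumed details: the magnitude threshold, decay factor, and helper names (`selective_decay_step`, `sparsify`) are hypothetical and not taken from the paper.

```python
import torch

def selective_decay_step(model, threshold=1e-2, decay=0.98):
    # Hypothetical illustration of the idea in the abstract: weights whose
    # magnitude falls below a threshold are scaled down moderately at each
    # step instead of being clamped to zero, so weights targeted in error
    # can still recover later in training.
    with torch.no_grad():
        for param in model.parameters():
            if param.dim() > 1:                      # skip biases / norm scales
                small = param.abs() < threshold      # "insignificant" weights
                param[small] *= decay                # gentle decay, not hard pruning

def sparsify(model, eps=1e-4):
    # After training, weights that have drifted close to zero can be
    # set to exactly zero to obtain the final sparse model.
    with torch.no_grad():
        for param in model.parameters():
            if param.dim() > 1:
                param[param.abs() < eps] = 0.0
```

A training loop would call `selective_decay_step(model)` after each optimizer step and `sparsify(model)` once at the end. The threshold, decay factor, and schedule shown here are placeholders, since the paper's actual settings are not given in this record.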