Late Breaking Results: Weight Decay is ALL You Need for Neural Network Sparsification

Bibliographic Details
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1-2
Main Authors: Chen, Xizi; Pan, Rui; Wang, Xiaomeng; Tian, Fengshi; Tsui, Chi-Ying
Format: Conference Proceeding
Language: English
Published: IEEE, 09.07.2023
DOI: 10.1109/DAC56929.2023.10247950

Summary: The heuristic iterative pruning strategy has been widely used for neural network sparsification. However, it is challenging to identify the right connections to remove at each pruning iteration with only a one-shot evaluation of weight magnitude, especially at the early pruning stage. The erroneously removed connections, unfortunately, can hardly be recovered. In this work, we propose a weight decay strategy as a substitute for pruning, which lets the "insignificant" weights decay moderately instead of being clamped directly to zero. By the end of training, the vast majority of redundant weights naturally become close to zero, making it easier to identify which connections can be removed safely. Experimental results show that the proposed weight decay method can achieve an ultra-high sparsity of 99%. Compared to the current pruning strategy, the model size is further reduced by 34%, improving the compression rate from 69x to 106x at the same accuracy.
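
The summary describes the core idea: rather than pruning low-magnitude connections outright, an extra decay is applied to them during training so they drift toward zero on their own. The paper's code is not reproduced in this record; the following is a minimal PyTorch-style sketch of such magnitude-gated decay, where the names selective_weight_decay, decay_strength, and keep_ratio are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of magnitude-gated weight decay (not the authors' code).
    import torch

    def selective_weight_decay(model, decay_strength=1e-3, keep_ratio=0.01):
        # Apply an extra multiplicative decay only to the smallest-magnitude
        # weights, letting them approach zero instead of clamping them to zero.
        with torch.no_grad():
            for param in model.parameters():
                if param.dim() < 2:  # skip biases and normalization parameters
                    continue
                flat = param.abs().flatten()
                k = max(1, int(keep_ratio * flat.numel()))
                # Magnitude of the k-th largest weight serves as the
                # boundary between "significant" and "insignificant" weights.
                threshold = torch.topk(flat, k, largest=True).values.min()
                mask = param.abs() < threshold
                # Moderate decay of the "insignificant" weights,
                # rather than a hard clamp to zero.
                param[mask] = param[mask] * (1.0 - decay_strength)

In such a setup the function would be called after each optimizer step; once training ends, weights whose magnitudes have decayed below a small epsilon can be removed safely, which matches the behaviour the summary describes.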