Stage-Wise Magnitude-Based Pruning for Recurrent Neural Networks

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 35, No. 2, pp. 1666–1680
Main Authors: Li, Guiying; Yang, Peng; Qian, Chao; Hong, Richang; Tang, Ke
Format: Journal Article
Language: English
Published: United States, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2024

Summary: Recurrent neural networks (RNNs) have shown powerful performance in tackling various natural language processing (NLP) tasks, resulting in numerous strong models containing both RNN neurons and feedforward neurons. On the other hand, the deep structure of RNNs has heavily restricted their deployment on mobile devices, where quite a few applications involve NLP tasks. Magnitude-based pruning (MP) is a promising way to address this challenge. However, existing MP methods are mostly designed for feedforward neural networks without a recurrent structure and, thus, have performed less satisfactorily when pruning models containing RNN layers. In this article, a novel stage-wise MP method is proposed that explicitly takes the recurrent structure of RNNs into account, enabling it to prune feedforward layers and RNN layers simultaneously and effectively. The connections of the network are first grouped into three types according to how they intersect with recurrent neurons. Then, an optimization-based pruning method is applied to compress each group of connections separately. Empirical studies show that the proposed method performs significantly better than commonly used RNN pruning methods; i.e., up to 96.84% of connections are pruned with little or even no degradation of precision indicators on the testing datasets.
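To make the grouping idea concrete, the following is a minimal sketch of group-wise magnitude-based pruning. It is not the authors' stage-wise, optimization-based method; the group names, sparsity targets, and sizes are illustrative assumptions. It only shows the generic MP primitive the abstract builds on: partition an RNN cell's weight matrices by whether they feed into, out of, or between recurrent neurons, then zero out the smallest-magnitude weights within each group.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of `weights`,
    keeping roughly the top (1 - sparsity) fraction by |value|."""
    flat = np.abs(weights).ravel()
    k = int(flat.size * sparsity)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Hypothetical grouping of an RNN cell's connections by how they
# touch recurrent neurons (sizes and sparsity targets are made up).
rng = np.random.default_rng(0)
groups = {
    "input_to_hidden": rng.normal(size=(128, 64)),    # feedforward into the RNN
    "hidden_to_hidden": rng.normal(size=(128, 128)),  # recurrent connections
    "hidden_to_output": rng.normal(size=(10, 128)),   # feedforward out of the RNN
}
# Recurrent connections often tolerate less pruning, so they get a
# gentler target here; each group is compressed independently.
sparsity = {"input_to_hidden": 0.9, "hidden_to_hidden": 0.8, "hidden_to_output": 0.9}

pruned = {name: magnitude_prune(w, sparsity[name]) for name, w in groups.items()}
for name, w in pruned.items():
    print(f"{name}: {np.mean(w == 0):.2f} of weights zeroed")
```

In practice, the paper's contribution lies in replacing the fixed per-group threshold above with an optimization-based choice of what to prune in each group, applied stage by stage.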
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2022.3184730