Deep Neural Network Acceleration With Sparse Prediction Layers
Published in | IEEE Access Vol. 8; pp. 6839-6848 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020 |
Summary: | The ever-increasing computation cost of Convolutional Neural Networks (CNNs) makes it imperative for real-world applications to accelerate the key steps, especially inference. In this work, we propose an efficient yet general scheme called the Sparse Prediction Layer (SPL), which predicts and skips the trivial elements in a CNN layer. Pruned weights are used to predict the locations of the maximum values in max-pooling kernels and of the positive values before Rectified Linear Units (ReLUs). The precise values of these predicted important elements are then calculated selectively, and the complete outputs are restored from them. Our experiments on the ImageNet Large Scale Visual Recognition Competition (ILSVRC) 2012 show that SPL reduces Floating-point Operations (FLOPs) by 68.3%, 58.6% and 59.5% on AlexNet, VGG-16 and ResNet-50, respectively, with an accuracy loss of less than 1% and without retraining. SPL can also further accelerate networks that have already been pruned by other pruning-based methods: for example, it achieves a FLOP reduction of 50.2% on a ResNet-50 that was pruned by Channel Pruning (CP) before SPLs were applied. A special matrix multiplication called Sparse Result Matrix Multiplication (SRMM) is proposed to support the implementation of SPL, and its acceleration effect is in line with expectations. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2020.2963941 |
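
The predict-then-compute idea in the summary can be illustrated with a minimal NumPy sketch for a fully connected layer followed by ReLU. This is not the authors' implementation: the magnitude-based pruning rule, the `keep_ratio` parameter, and the function names `prune_by_magnitude` and `spl_relu_layer` are assumptions made for illustration, and the selective dot products only mimic the intent of SRMM (computing just the needed entries of the result matrix), not its actual kernel.

```python
import numpy as np

def prune_by_magnitude(W, keep_ratio):
    """Keep only the largest-magnitude weights (assumed pruning rule)."""
    k = max(1, int(round(W.size * keep_ratio)))
    thresh = np.sort(np.abs(W), axis=None)[-k]
    return np.where(np.abs(W) >= thresh, W, 0.0)

def spl_relu_layer(X, W, keep_ratio=0.3):
    """Approximate ReLU(X @ W) by predicting which outputs are positive.

    X: (n, d) inputs, W: (d, m) weights.
    Returns the restored output and the predicted-positive mask.
    """
    # Prediction step: a cheap pre-activation estimate from pruned weights.
    W_pruned = prune_by_magnitude(W, keep_ratio)
    mask = (X @ W_pruned) > 0

    # Selective step: exact dot products only at predicted positions,
    # i.e. only a sparse subset of the result matrix is computed.
    out = np.zeros((X.shape[0], W.shape[1]))
    rows, cols = np.nonzero(mask)
    out[rows, cols] = np.einsum("ij,ij->i", X[rows], W[:, cols].T)

    # Restoration step: skipped entries stay zero; ReLU clears any
    # false positives whose exact value turned out to be negative.
    return np.maximum(out, 0.0), mask

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 16))
W = rng.standard_normal((16, 8))
approx, mask = spl_relu_layer(X, W, keep_ratio=0.3)
exact = np.maximum(X @ W, 0.0)
print("fraction of outputs computed exactly:", mask.mean())
print("max abs error vs dense layer:", np.abs(approx - exact).max())
```

In this sketch the only source of error is a false negative, an output that is truly positive but predicted as non-positive and therefore skipped. How aggressively the predictor can be pruned before such errors hurt accuracy is the trade-off the summary quantifies.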