NID: Processing Binary Convolutional Neural Network in Commodity DRAM

Bibliographic Details
Published in	2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1-8
Main Authors	Sim, Jaehyeong; Seol, Hoseok; Kim, Lee-Sup
Format	Conference Proceeding
Language	English
Published	ACM 01.11.2018
Subjects	Adders; Convolution; Kernel; Logic gates; Memory management; Random access memory; Resource management
Summary:	Recent large-scale CNNs suffer from a severe memory wall problem because their weights number from tens to hundreds of millions. Processing in-memory (PIM) and binary CNNs have been proposed to reduce the number of memory accesses and the memory footprint, respectively. By combining the two concepts, we propose a novel processing in-DRAM framework for binary CNNs, called NID, in which the dominant convolution operations are processed using in-DRAM bulk bitwise operations. We first identify the problem that bitcount operations built from only bulk bitwise AND/OR/NOT incur significant delay overhead as the kernel size grows. We then optimize performance by efficiently allocating inputs and kernels to DRAM banks for both convolutional and fully-connected layers through design space exploration, and mitigate the overhead of bitcount operations by splitting kernels into multiple parts. Partial sum accumulation and the tasks of other layers, such as max-pooling and normalization, are processed in the peripheral area of the DRAM with negligible overhead. As a result, our NID framework achieves 19×-36× performance and 9×-14× EDP improvements for convolutional layers, and 9×-17× performance and 1.4×-4.5× EDP improvements for fully-connected layers, over a previous PIM technique on four large-scale CNN models.
ISSN:	1558-2434
DOI:	10.1145/3240765.3240831
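
Illustrative sketch:	The summary describes binary convolution carried out with bulk bitwise AND/OR/NOT, a bitcount step whose cost grows with kernel size, and kernel splitting with partial sums accumulated outside the array. The Python sketch below illustrates that idea only; it is not the authors' implementation. It assumes each DRAM row can be modeled as a Python integer whose bits are independent columns, and all names (`WIDTH`, `bitcount_rows`, `binary_dot`, `parts`) are hypothetical.

```python
WIDTH = 8                      # assumed number of columns processed in parallel
MASK = (1 << WIDTH) - 1

# The only row-wide primitives the sketch allows itself.
def NOT(a):    return ~a & MASK
def AND(a, b): return a & b
def OR(a, b):  return a | b

def xnor(a, b):
    # XNOR built from AND/OR/NOT: 1 where the input bit matches the weight bit.
    return OR(AND(a, b), AND(NOT(a), NOT(b)))

def xor(a, b):
    # XOR built from AND/OR/NOT, used as the sum bit of a half adder.
    return AND(OR(a, b), NOT(AND(a, b)))

def bitcount_rows(rows):
    """Column-wise count of set bits, returned as bit planes (LSB first).

    Each incoming row ripples through every counter plane, so the work per
    row grows with the counter width, i.e. with log2 of the number of rows.
    """
    planes = [0] * len(rows).bit_length()
    for row in rows:
        carry = row
        for i in range(len(planes)):
            planes[i], carry = xor(planes[i], carry), AND(planes[i], carry)
    return planes

def column_counts(planes):
    # Read the per-column counters back as ordinary integers.
    return [sum(((p >> c) & 1) << i for i, p in enumerate(planes))
            for c in range(WIDTH)]

def binary_dot(inputs, weights, parts=1):
    """Per-column dot product of {-1,+1} vectors stored as bits (1 -> +1).

    With parts > 1 the kernel rows are split into chunks, each chunk is
    counted with a narrower counter, and the partial counts are added with
    plain integer arithmetic (standing in for peripheral accumulation).
    """
    k = len(inputs)
    matches = [xnor(a, w) for a, w in zip(inputs, weights)]
    step = -(-k // parts)      # ceiling division
    totals = [0] * WIDTH
    for start in range(0, k, step):
        partial = column_counts(bitcount_rows(matches[start:start + step]))
        totals = [t + p for t, p in zip(totals, partial)]
    # matches minus mismatches = 2 * matches - k
    return [2 * t - k for t in totals]
```

Calling `binary_dot(inputs, weights, parts=3)` returns the same values as `parts=1`. In this sketch the split only shortens each in-array counter and moves the final additions outside the bitwise path.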