Inside Project Brainwave's Cloud-Scale, Real-Time AI Processor

Bibliographic Details
Published in: IEEE MICRO, Vol. 39, No. 3, pp. 20-28
Main Authors: Fowers, Jeremy; Ovtcharov, Kalin; Papamichael, Michael K.; Massengill, Todd; Liu, Ming; Lo, Daniel; Alkalay, Shlomi; Haselman, Michael; Adams, Logan; Ghandi, Mahdi; Heil, Stephen; Patel, Prerak; Sapek, Adam; Weisz, Gabriel; Woods, Lisa; Lanka, Sitaram; Reinhardt, Steven K.; Caulfield, Adrian M.; Chung, Eric S.; Burger, Doug
Format: Journal Article
Language: English
Published: Los Alamitos: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2019

Summary: Growing computational demands from deep neural networks (DNNs), coupled with diminishing returns from general-purpose architectures, have led to a proliferation of Neural Processing Units (NPUs). This paper describes the Project Brainwave NPU (BW-NPU), a parameterized microarchitecture specialized at synthesis time for convolutional and recurrent DNN workloads. The BW-NPU deployed on an Intel Stratix 10 280 FPGA achieves sustained performance of 35 teraflops at a batch size of 1 on a large recurrent neural network (RNN).
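For context, the 35-teraflops figure in the summary is a *sustained* rate: floating-point operations actually performed divided by wall-clock time, rather than a peak hardware number. A minimal sketch of how such a rate is computed for a batch-1 RNN step follows; the layer sizes and latency are hypothetical round numbers chosen for illustration, not values taken from the paper.

```python
# Illustrative sustained-TFLOPS calculation for a batch-1 RNN step.
# All dimensions and the latency are hypothetical, not from the paper.

def matvec_flops(rows: int, cols: int) -> int:
    """A dense matrix-vector product performs one multiply and one add
    per matrix element: 2 * rows * cols floating-point operations."""
    return 2 * rows * cols

# One step of a simple RNN: h' = f(W_x @ x + W_h @ h)
hidden, input_dim = 2048, 2048  # hypothetical layer sizes
flops_per_step = matvec_flops(hidden, input_dim) + matvec_flops(hidden, hidden)

latency_s = 0.5e-6  # hypothetical per-step latency (0.5 microseconds)
sustained_tflops = flops_per_step / latency_s / 1e12

print(f"{flops_per_step} FLOPs per step -> {sustained_tflops:.2f} TFLOPS")
```

At batch size 1 there is no batching to amortize weight fetches, which is why sustaining tens of teraflops in that regime is the notable claim of the abstract.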
ISSN: 0272-1732, 1937-4143
DOI: 10.1109/MM.2019.2910506