Photonic Reconfigurable Accelerators for Efficient Inference of CNNs with Mixed-Sized Tensors


Bibliographic Details
Published in	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 41, no. 11, p. 1
Main Authors	Vatsavai, Sairam Sri; Thakkar, Ishan G
Format	Journal Article
Language	English
Published	New York: IEEE, 01.11.2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Accelerator; Accelerators; Artificial neural networks; Convolutional neural networks; Costs; Deep Learning; Energy efficiency; Frames per second; Hardware; Inference; Kernel; Mathematical analysis; Photonics; Reconfigurability; Reconfiguration; Silicon Photonics; Tensors; Throughput

More Information
Summary:	Photonic Microring Resonator (MRR) based hardware accelerators have been shown to provide disruptive speedup and energy-efficiency improvements for processing deep Convolutional Neural Networks (CNNs). However, previous MRR-based CNN accelerators fail to provide efficient adaptability for CNNs with mixed-sized tensors. One example of such CNNs is depthwise separable CNNs. Performing inference for CNNs with mixed-sized tensors on such inflexible accelerators often leads to low hardware utilization, which diminishes the performance and energy efficiency achievable from the accelerators. In this paper, we present a novel way of introducing reconfigurability in MRR-based CNN accelerators, to enable dynamic maximization of the size compatibility between the accelerator hardware components and the CNN tensors processed on those components. We classify the state-of-the-art MRR-based CNN accelerators from prior works into two categories, based on the layout and relative placement of the hardware components used in the accelerators. We then use our method to introduce reconfigurability in accelerators from these two classes, thereby improving their parallelism, their flexibility in efficiently mapping tensors of different sizes, their speed, and their overall energy efficiency. We evaluate our reconfigurable accelerators against three prior works under the area-proportionate outlook (equal hardware area for all accelerators). Our evaluation for the inference of four modern CNNs indicates that our reconfigurable CNN accelerators provide improvements of up to 1.8× in Frames-Per-Second (FPS) and up to 1.5× in FPS/W, compared to an MRR-based accelerator from prior work.
ISSN:	0278-0070 1937-4151
DOI:	10.1109/TCAD.2022.3197538