DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs

Bibliographic Details
Published in	2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 304-316
Main Authors	Tan, Cheng; Agostini, Nicolas Bohm; Geng, Tong; Xie, Chenhao; Li, Jiajia; Li, Ang; Barker, Kevin J.; Tumeo, Antonino
Format	Conference Proceeding
Language	English
Published	IEEE 01.04.2022
Subjects	CGRA; Computer architecture; Logic gates; Partial Reconfiguration; Pipelines; Prototypes; Reconfigurable devices; Software; Spatial Accelerator; Streaming Application; Throughput

More Information
Summary:	Coarse-grained reconfigurable arrays (CGRAs) provide higher flexibility than application-specific integrated circuits (ASICs) and higher efficiency than fine-grained reconfigurable devices such as Field Programmable Gate Arrays (FPGAs). However, CGRAs are generally designed to support offloading of a single kernel. While the CGRA design, based on communicating functional units, appears to naturally suit data streaming applications composed of multiple cooperating kernels, current approaches only statically partition the resources across application kernels. However, emerging streaming applications at the edge (scientific instruments, sensor networks, network processing) perform much more than digital signal processing and often are data and input dependent. This leads to extremely variable kernel execution times, severely impacting the throughput of the entire pipeline if resources are only statically allocated. Therefore, in this paper, we propose DRIPS - a novel CGRA architecture that can dynamically rebalance the pipeline of data-dependent streaming applications. We present a unified compiler framework to facilitate the mapping of a given streaming application onto the DRIPS CGRA architecture. The experimental results show that DRIPS achieves an average throughput improvement of 1.46× across a set of representative applications over a statically partitioned solution. The additional area overhead to enable dynamic rebalancing consumes 16.34% of the entire area for a 5×5 CGRA prototype.
ISSN:	2378-203X
DOI:	10.1109/HPCA53966.2022.00030