On the Design of Time-Constrained and Buffer-Optimal Self-Timed Pipelines

Bibliographic Details
Published in	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 38, no. 8, pp. 1515-1528
Main Authors	Jiang, Weiwen; Sha, Edwin Hsing-Mean; Zhuge, Qingfeng; Yang, Lei; Chen, Xianzhang; Hu, Jingtong
Format	Journal Article
Language	English
Published	New York: IEEE, 01.08.2019; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Buffers; Central processing units; Clocks; Computation; Constraints; CPUs; Field programmable gate arrays; Fires; Formulations; Graph theory; Integer programming; Linear programming; Optimization algorithms; performance bottleneck; Pipeline processing; pipeline systems; Pipelines; Pipelining (computers); self-timed communications; Synchronization; system model
Summary:	Pipelining is a powerful technique for achieving high performance in computing systems. However, as computing platforms grow in scale and integrate heterogeneous processing elements (PEs) such as CPUs, GPUs, and field-programmable gate arrays, it becomes difficult to employ a global clock to achieve synchronous pipelines. Therefore, self-timed (or asynchronous) pipelines are usually adopted. Nevertheless, due to their complex running behavior, performance modeling and systematic optimization for self-timed pipeline (STP) systems are more complicated than for synchronous ones. This paper employs marked graph theory to model STPs and presents algorithms to detect performance bottlenecks. Based on the proposed model, we observe that system performance can be improved by inserting buffers. Because memory resources on the PEs are limited, it is critical to minimize the number of buffers for STPs while satisfying the required timing constraints. In this paper, we propose integer linear programming formulations to obtain optimal solutions and devise efficient algorithms to obtain near-optimal solutions. Experimental results show that the proposed algorithms can achieve a 53.10% improvement in the maximum performance and a 54.04% reduction in the number of buffers, compared with the technique for the slack matching problem.
ISSN:	0278-0070, 1937-4151
DOI:	10.1109/TCAD.2018.2846642
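
Illustration:	The summary states that STPs are modeled as marked graphs and that inserting buffers can improve performance. The sketch below is not the paper's algorithm; it is a minimal brute-force illustration that relies on the classical result for timed marked graphs, namely that the steady-state cycle time equals the maximum, over all directed cycles, of total delay divided by total tokens. The pipeline encoding (a one-token self-loop per stage, a token-free forward edge, and a backward edge holding one token per free buffer slot) and all names (`simple_cycles`, `cycle_time`, `linear_pipeline`) are assumptions made for this example.

```python
from fractions import Fraction


def simple_cycles(edges):
    """Yield every simple cycle once, as a list of edge indices.

    edges: list of (src, dst, tokens). Each cycle is rooted at its
    smallest node, so it is reported exactly once.
    """
    out = {}
    for idx, (u, v, _) in enumerate(edges):
        out.setdefault(u, []).append((idx, v))

    def dfs(start, node, path, visited):
        for idx, nxt in out.get(node, []):
            if nxt == start:
                yield path + [idx]
            elif nxt > start and nxt not in visited:
                yield from dfs(start, nxt, path + [idx], visited | {nxt})

    for start in sorted(out):
        yield from dfs(start, start, [], {start})


def cycle_time(delays, edges):
    """Maximum cycle ratio: max over cycles of (sum of delays) / (sum of tokens)."""
    best = Fraction(0)
    for cyc in simple_cycles(edges):
        tokens = sum(edges[i][2] for i in cyc)
        if tokens == 0:
            raise ValueError("token-free cycle: the graph deadlocks")
        # Each node on the cycle is the source of exactly one cycle edge.
        delay = sum(delays[edges[i][0]] for i in cyc)
        best = max(best, Fraction(delay, tokens))
    return best


def linear_pipeline(delays, slots):
    """Assumed encoding of a linear pipeline; slots[i] is the buffer
    capacity between stage i and stage i + 1."""
    edges = [(i, i, 1) for i in range(len(delays))]  # one item per stage at a time
    for i, b in enumerate(slots):
        edges.append((i, i + 1, 0))  # data moves forward; channel starts empty
        edges.append((i + 1, i, b))  # free slots move backward
    return edges


delays = [2, 5, 3]
print(cycle_time(delays, linear_pipeline(delays, [1, 1])))  # 8
print(cycle_time(delays, linear_pipeline(delays, [2, 2])))  # 5
```

With one slot per channel, the cycle through stages 1 and 2 limits the cycle time to (5 + 3) / 1 = 8. Adding a second slot to each channel lowers it to 5, the delay of the slowest stage, and further buffers would not help in this toy case. Cycle enumeration is exponential in general, so this is suitable only for small graphs.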