Learning-Based Application-Agnostic 3D NoC Design for Heterogeneous Manycore Systems

The rising use of deep learning and other big-data algorithms has led to an increasing demand for hardware platforms that are computationally powerful, yet energy-efficient. Due to the amount of data parallelism in these algorithms, high-performance three-dimensional (3D) manycore platforms that inc...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on computers Vol. 68; no. 6; pp. 852 - 866
Main Authors Joardar, Biresh Kumar, Kim, Ryan Gary, Doppa, Janardhan Rao, Pande, Partha Pratim, Marculescu, Diana, Marculescu, Radu
Format Journal Article
LanguageEnglish
Published New York IEEE 01.06.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:The rising use of deep learning and other big-data algorithms has led to an increasing demand for hardware platforms that are computationally powerful, yet energy-efficient. Due to the amount of data parallelism in these algorithms, high-performance three-dimensional (3D) manycore platforms that incorporate both CPUs and GPUs present a promising direction. However, as systems use heterogeneity (e.g., a combination of CPUs, GPUs, and accelerators) to improve performance and efficiency, it becomes more pertinent to address the distinct and likely conflicting communication requirements (e.g., CPU memory access latency or GPU network throughput) that arise from such heterogeneity. Unfortunately, it is difficult to quickly explore the hardware design space and choose appropriate tradeoffs between these heterogeneous requirements. To address these challenges, we propose the design of a 3D Network-on-Chip (NoC) for heterogeneous manycore platforms that considers the appropriate design objectives for a 3D heterogeneous system and explores various tradeoffs using an efficient machine learning (ML)-based multi-objective optimization (MOO) technique. The proposed design space exploration considers the various requirements of its heterogeneous components and generates a set of 3D NoC architectures that efficiently trades off these design objectives. Our findings show that by jointly considering these requirements (latency, throughput, temperature, and energy), we can achieve 9.6 percent better Energy-Delay Product on average at nearly iso-temperature conditions when compared to a thermally-optimized design for 3D heterogeneous NoCs. More importantly, our results suggest that our 3D NoCs optimized for a few applications can be generalized for unknown applications as well. Our results show that these generalized 3D NoCs only incur a 1.8 percent (36-tile system) and 1.1 percent (64-tile system) average performance loss compared to application-specific NoCs.
ISSN:0018-9340
1557-9956
DOI:10.1109/TC.2018.2889053