EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks

Bibliographic Details
Published in	IEEE Transactions on Computers, Vol. 70, no. 9, pp. 1511-1525
Main Authors	Liang, Shengwen, Wang, Ying, Liu, Cheng, He, Lei, Li, Huawei, Xu, Dawen, Li, Xiaowei
Format	Journal Article
Language	English
Published	New York IEEE 01.09.2021 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	accelerator architecture; Apexes; Central processing units; CPUs; Data structures; Energy efficiency; Feature extraction; Graph neural network; Graph neural networks; Graphics processing units; Hardware; hardware acceleration; Memory management; Neural networks; Social networks; System-on-chip; Task analysis; Tiling
Summary:	Graph neural networks (GNNs) have emerged as a powerful approach to processing non-Euclidean data structures and have proven effective in application domains such as social networks and e-commerce. Because the graph data maintained in real-world systems can be extremely large and sparse, employing GNNs to process it incurs substantial computational and memory overhead, which induces considerable energy and resource cost on CPUs and GPUs. In this article, we present a specialized accelerator architecture, EnGN, to enable high-throughput and energy-efficient processing of large-scale GNNs. EnGN is designed to accelerate the three key stages of GNN propagation, which are abstracted as common computing patterns shared by typical GNNs. To support the key stages simultaneously, we propose the ring-edge-reduce (RER) dataflow, which tames the poor locality of sparsely and randomly connected vertices, and the RER PE-array, which implements the RER dataflow. In addition, we use a graph tiling strategy to fit large graphs into EnGN and make good use of the hierarchical on-chip buffers through adaptive computation reordering and tile scheduling. Overall, EnGN achieves average speedups of 1802.9X, 19.75X, and 2.97X and average energy-efficiency improvements of 1326.35X, 304.43X, and 6.2X over a CPU, a GPU, and a state-of-the-art GCN accelerator, HyGCN, respectively.
ISSN:	0018-9340, 1557-9956
DOI:	10.1109/TC.2020.3014632