NTGAT: A Graph Attention Network Accelerator with Runtime Node Tailoring


Bibliographic Details
Published in: Proceedings of the 28th Asia and South Pacific Design Automation Conference, pp. 645-650
Main Authors: Hou, Wentao; Zhong, Kai; Zeng, Shulin; Dai, Guohao; Yang, Huazhong; Wang, Yu
Format: Conference Proceeding
Language: English
Published: New York, NY, USA: ACM, 16.01.2023
Series: ACM Conferences

Summary: Graph Attention Network (GAT) has demonstrated better performance on many graph tasks than previous Graph Neural Networks (GNNs). However, it involves graph attention operations that add computational complexity. While a large body of existing literature has studied GNN acceleration, few works have focused on the attention mechanism in GAT. The graph attention mechanism changes the computation flow, so previous GNN accelerators cannot support GAT well. Moreover, GAT distinguishes the importance of neighbors, which makes it possible to reduce the workload through runtime tailoring. We present NTGAT, a software-hardware co-design approach to accelerating GAT with runtime node tailoring. Our work comprises both a runtime node tailoring algorithm and an accelerator design. We propose a pipelined sorting method and a hardware unit to support node tailoring during inference. Experiments show that our algorithm can reduce up to 86% of the aggregation workload while incurring a slight accuracy loss (<0.4%), and the FPGA-based accelerator achieves up to 3.8x speedup and 4.98x energy efficiency compared to the GPU baseline.
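The core idea in the summary — dropping low-importance neighbors at runtime using GAT's own attention scores — can be illustrated with a minimal sketch. This is not the paper's exact algorithm; the function name, the `keep_ratio` knob, and the renormalization step are assumptions for illustration only:

```python
import numpy as np

def tailored_aggregate(features, neighbor_ids, attn_scores, keep_ratio=0.5):
    """Aggregate neighbor features, keeping only the top-scoring
    neighbors by attention weight (hypothetical sketch of runtime
    node tailoring; keep_ratio is an illustrative parameter)."""
    k = max(1, int(len(neighbor_ids) * keep_ratio))
    # Sort neighbors by attention score (descending) and keep the top k.
    order = np.argsort(attn_scores)[::-1][:k]
    kept_ids = [neighbor_ids[i] for i in order]
    kept_scores = attn_scores[order]
    # Renormalize the surviving attention weights so they sum to 1.
    weights = kept_scores / kept_scores.sum()
    # Weighted sum of the kept neighbors' feature vectors.
    return weights @ features[kept_ids]
```

Because neighbors with near-zero attention contribute little to the weighted sum, discarding them shrinks the aggregation workload with only a small perturbation of the output, which is the intuition behind the reported 86% workload reduction at <0.4% accuracy loss.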
ISBN: 9781450397834, 1450397832
DOI: 10.1145/3566097.3567869