NTGAT: A Graph Attention Network Accelerator with Runtime Node Tailoring
Published in | Proceedings of the 28th Asia and South Pacific Design Automation Conference pp. 645 - 650 |
---|---|
Main Authors | , , , , , |
Format | Conference Proceeding |
Language | English |
Published | New York, NY, USA: ACM, 16.01.2023 |
Series | ACM Conferences |
Summary: | Graph Attention Network (GAT) has demonstrated better performance than previous Graph Neural Networks (GNNs) on many graph tasks. However, its graph attention operations add computational complexity. While a large body of literature has studied GNN acceleration, little of it has focused on the attention mechanism in GAT. The graph attention mechanism changes the computation flow, so previous GNN accelerators cannot support GAT well. At the same time, GAT distinguishes the importance of neighbors, which makes it possible to reduce the workload through runtime tailoring. We present NTGAT, a software-hardware co-design approach to accelerate GAT with runtime node tailoring. Our work comprises both a runtime node tailoring algorithm and an accelerator design. We propose a pipeline sorting method and a hardware unit to support node tailoring during inference. Experiments show that our algorithm can reduce the aggregation workload by up to 86% while incurring only a slight accuracy loss (<0.4%), and that the FPGA-based accelerator achieves up to 3.8× speedup and 4.98× higher energy efficiency compared to the GPU baseline. |
---|---|
ISBN: | 9781450397834, 1450397832 |
DOI: | 10.1145/3566097.3567869 |
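
The summary's idea that attention coefficients let low-importance neighbors be dropped at inference time can be illustrated with a small sketch. This is not the paper's algorithm, which the record does not describe in detail. It assumes a plain top-k rule: rank one node's neighbors by softmax attention, keep the `keep` highest, renormalize, and aggregate. The function name `tailor_and_aggregate` and the `keep` parameter are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tailor_and_aggregate(scores, features, keep):
    """Keep the `keep` neighbors with the highest attention, then aggregate.

    Assumed inputs (illustrative, not taken from the paper):
      scores:   raw attention logits, one per neighbor of a single node
      features: transformed neighbor feature vectors, same order as scores
      keep:     number of neighbors retained after tailoring
    """
    alphas = softmax(scores)
    # Rank neighbors by attention coefficient, highest first.
    ranked = sorted(range(len(alphas)), key=lambda j: alphas[j], reverse=True)
    kept = sorted(ranked[:keep])
    # Renormalize the surviving coefficients so they sum to 1 again.
    mass = sum(alphas[j] for j in kept)
    dim = len(features[0])
    out = [0.0] * dim
    for j in kept:
        w = alphas[j] / mass
        for d in range(dim):
            out[d] += w * features[j][d]
    return kept, out


if __name__ == "__main__":
    scores = [2.0, 0.0, 2.0, -1.0]
    features = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 5.0]]
    kept, out = tailor_and_aggregate(scores, features, keep=2)
    print(kept, out)
```

With `keep=2` in this toy input, half of the neighbors are skipped during aggregation. The trade-off between skipped work and accuracy depends on how concentrated the attention distribution is, which is the property the summary says GAT makes available.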