Software-hardware co-design for accelerating large-scale graph convolutional network inference on FPGA


Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 532; pp. 129-140
Main Authors Ran, Shaolin; Zhao, Beizhen; Dai, Xing; Cheng, Cheng; Zhang, Yong
Format Journal Article
Language English
Published Elsevier B.V. 01.05.2023
Abstract Inspired by convolutional neural networks, graph convolutional networks (GCNs) have been proposed for processing non-Euclidean graph data and have been successfully applied in recommendation systems, smart traffic, and other domains. However, owing to the sparsity and irregularity of GCN models, the complex execution pattern of large-scale GCNs poses significant challenges to efficient inference on general-purpose CPUs and GPUs, such as workload imbalance and irregular memory access. Therefore, we propose a software-hardware co-design framework for low-latency GCN inference on field-programmable gate arrays (FPGAs). Specifically, at the algorithm level, we propose an attention-mechanism-based graph sparsification approach that reduces redundant relations in the graph structure and alleviates irregularity without losing accuracy. Then, at the hardware design level, based on the sparsified graph, we propose a two-stage hardware architecture that supports the two phases of the GCN, each of which has a distinct execution mode. To achieve low-latency computation, edge-level and feature-level parallelism are exploited in the aggregation phase. In addition, a graph partition strategy is used to improve data reuse. The experimental results demonstrate that the proposed framework can achieve a 739× speedup compared to CPU, a 13.7× speedup compared to GPU on average, and a 6.8× speedup compared to state-of-the-art accelerators.
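The sparsification step and the two GCN phases named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`sparsify_top_k`, `aggregate`, `combine`) are hypothetical, the edge scores are simply passed in rather than learned by an attention mechanism, top-k selection per destination node is an assumed pruning rule, and mean aggregation with a self-loop stands in for whatever normalization the paper uses. The nested loops in `aggregate` mark where edge-level and feature-level parallelism would apply in hardware.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (source node, destination node)


def sparsify_top_k(edges: List[Edge], scores: Dict[Edge, float], k: int) -> List[Edge]:
    """Keep, for each destination node, the k incoming edges with the highest score.

    Assumption: `scores` stands in for attention coefficients; here they are given.
    """
    incoming: Dict[int, List[Edge]] = {}
    for edge in edges:
        incoming.setdefault(edge[1], []).append(edge)
    kept: List[Edge] = []
    for dst in sorted(incoming):
        ranked = sorted(incoming[dst], key=lambda e: scores[e], reverse=True)
        kept.extend(ranked[:k])
    return kept


def aggregate(num_nodes: int, edges: List[Edge],
              features: List[List[float]]) -> List[List[float]]:
    """Aggregation phase: mean over in-neighbour features plus the node itself."""
    dim = len(features[0])
    sums = [list(features[v]) for v in range(num_nodes)]  # self-loop contribution
    counts = [1] * num_nodes
    for src, dst in edges:        # edge-level loop (sparse, irregular accesses)
        for f in range(dim):      # feature-level loop (dense, regular accesses)
            sums[dst][f] += features[src][f]
        counts[dst] += 1
    return [[x / counts[v] for x in sums[v]] for v in range(num_nodes)]


def combine(aggregated: List[List[float]],
            weight: List[List[float]]) -> List[List[float]]:
    """Combination phase: dense matrix product with the layer weights, then ReLU."""
    out_dim = len(weight[0])
    return [
        [max(0.0, sum(row[i] * weight[i][j] for i in range(len(row))))
         for j in range(out_dim)]
        for row in aggregated
    ]
```

A single layer would then be evaluated as `combine(aggregate(n, sparsify_top_k(edges, scores, k), features), weight)`. Pruning edges before aggregation shrinks only the irregular part of the workload, while the combination phase remains a regular dense product.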
Author Cheng, Cheng
Zhao, Beizhen
Ran, Shaolin
Dai, Xing
Zhang, Yong
Author_xml – sequence: 1
  givenname: Shaolin
  surname: Ran
  fullname: Ran, Shaolin
  organization: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
– sequence: 2
  givenname: Beizhen
  surname: Zhao
  fullname: Zhao, Beizhen
  organization: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
– sequence: 3
  givenname: Xing
  surname: Dai
  fullname: Dai, Xing
  organization: School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
– sequence: 4
  givenname: Cheng
  surname: Cheng
  fullname: Cheng, Cheng
  organization: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
– sequence: 5
  givenname: Yong
  surname: Zhang
  fullname: Zhang, Yong
  email: zhangyong77@wust.edu.cn
  organization: School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
CitedBy_id crossref_primary_10_1016_j_neucom_2023_127210
crossref_primary_10_15446_dyna_v90n226_107112
crossref_primary_10_1080_02564602_2023_2297355
crossref_primary_10_3390_info15070377
Cites_doi 10.1109/TNSE.2022.3144484
10.1109/TPDS.2019.2910068
10.1007/s13278-016-0332-2
10.1016/j.neucom.2022.04.073
10.1109/LCA.2020.2970395
10.1016/j.neucom.2019.09.074
10.1016/j.neucom.2021.11.006
10.1016/j.sysarc.2021.102122
10.1109/ASAP49362.2020.00019
10.1093/nsr/nwz190
10.1145/3550075
10.1080/00207721.2020.1868615
10.1515/9781400841356.183
10.1016/j.neucom.2021.12.100
10.1016/j.neucom.2020.04.018
10.1016/j.ymssp.2022.109573
10.1109/TMECH.2020.2971503
10.1016/j.neucom.2022.06.076
10.1080/00207721.2021.1998722
10.1109/HPCA47549.2020.00012
10.1007/s13042-021-01285-w
10.1109/TC.2020.3014632
10.1109/TCSVT.2020.3020569
10.1145/3477141
10.1016/j.neucom.2021.06.065
10.1109/TAC.2021.3081256
10.1016/j.ress.2021.108263
10.1109/LCA.2017.2762308
ContentType Journal Article
Copyright 2023 Elsevier B.V.
Copyright_xml – notice: 2023 Elsevier B.V.
DBID AAYXX
CITATION
DOI 10.1016/j.neucom.2023.02.032
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1872-8286
EndPage 140
ExternalDocumentID 10_1016_j_neucom_2023_02_032
S0925231223001698
GroupedDBID ---
--K
--M
.DC
.~1
0R~
123
1B1
1~.
1~5
4.4
457
4G.
53G
5VS
7-5
71M
8P~
9JM
9JN
AABNK
AACTN
AADPK
AAEDT
AAEDW
AAIAV
AAIKJ
AAKOC
AALRI
AAOAW
AAQFI
AAXLA
AAXUO
AAYFN
ABBOA
ABCQJ
ABFNM
ABJNI
ABMAC
ABYKQ
ACDAQ
ACGFS
ACRLP
ACZNC
ADBBV
ADEZE
AEBSH
AEKER
AENEX
AFKWA
AFTJW
AFXIZ
AGHFR
AGUBO
AGWIK
AGYEJ
AHHHB
AHZHX
AIALX
AIEXJ
AIKHN
AITUG
AJOXV
ALMA_UNASSIGNED_HOLDINGS
AMFUW
AMRAJ
AOUOD
AXJTR
BKOJK
BLXMC
CS3
DU5
EBS
EFJIC
EFLBG
EO8
EO9
EP2
EP3
F5P
FDB
FIRID
FNPLU
FYGXN
G-Q
GBLVA
GBOLZ
IHE
J1W
KOM
LG9
M41
MO0
MOBAO
N9A
O-L
O9-
OAUVE
OZT
P-8
P-9
P2P
PC.
Q38
ROL
RPZ
SDF
SDG
SDP
SES
SEW
SPC
SPCBC
SSN
SSV
SSZ
T5K
ZMT
~G-
29N
AAQXK
AATTM
AAXKI
AAYWO
AAYXX
ABWVN
ABXDB
ACNNM
ACRPL
ACVFH
ADCNI
ADJOM
ADMUD
ADNMO
AEIPS
AEUPX
AFJKZ
AFPUW
AGCQF
AGQPQ
AGRNS
AIGII
AIIUN
AKBMS
AKRWK
AKYEP
ANKPU
APXCP
ASPBG
AVWKF
AZFZN
BNPGV
CITATION
EJD
FEDTE
FGOYB
HLZ
HVGLF
HZ~
R2-
RIG
SBC
SSH
WUQ
XPP
IEDL.DBID .~1
ISSN 0925-2312
IngestDate Thu Apr 24 23:11:11 EDT 2025
Tue Jul 01 04:24:52 EDT 2025
Fri Feb 23 02:37:57 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Software-hardware co-design
Graph convolutional network
Graph sparsification
FPGA
Hardware acceleration
Language English
LinkModel DirectLink
PageCount 12
ParticipantIDs crossref_primary_10_1016_j_neucom_2023_02_032
crossref_citationtrail_10_1016_j_neucom_2023_02_032
elsevier_sciencedirect_doi_10_1016_j_neucom_2023_02_032
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2023-05-01
2023-05-00
PublicationDateYYYYMMDD 2023-05-01
PublicationDate_xml – month: 05
  year: 2023
  text: 2023-05-01
  day: 01
PublicationDecade 2020
PublicationTitle Neurocomputing (Amsterdam)
PublicationYear 2023
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
Zhang (10.1016/j.neucom.2023.02.032_b0015) 2022; 220
10.1016/j.neucom.2023.02.032_b0250
10.1016/j.neucom.2023.02.032_b0095
Gao (10.1016/j.neucom.2023.02.032_b0065) 2022
Xue (10.1016/j.neucom.2023.02.032_b0030) 2020; 376
Chen (10.1016/j.neucom.2023.02.032_b0235) 2020
Wang (10.1016/j.neucom.2023.02.032_b0190) 2019
Ma (10.1016/j.neucom.2023.02.032_b0080) 2019
Auten (10.1016/j.neucom.2023.02.032_b0200) 2020
Wang (10.1016/j.neucom.2023.02.032_b0085) 2021; 121
10.1016/j.neucom.2023.02.032_b0215
Cheng (10.1016/j.neucom.2023.02.032_b0035) 2020; 25
10.1016/j.neucom.2023.02.032_b0020
Zhu (10.1016/j.neucom.2023.02.032_b0165) 2021; 457
10.1016/j.neucom.2023.02.032_b0185
10.1016/j.neucom.2023.02.032_b0265
10.1016/j.neucom.2023.02.032_b0145
Luo (10.1016/j.neucom.2023.02.032_b0240) 2021
Zhang (10.1016/j.neucom.2023.02.032_b0010) 2022; 503
Ma (10.1016/j.neucom.2023.02.032_b0050) 2022; 9
Wen (10.1016/j.neucom.2023.02.032_b0225) 2021
Yuan (10.1016/j.neucom.2023.02.032_b0025) 2020; 7
10.1016/j.neucom.2023.02.032_b0140
Chervyakov (10.1016/j.neucom.2023.02.032_b0155) 2020; 407
Wang (10.1016/j.neucom.2023.02.032_b0090) 2022; 471
Tian (10.1016/j.neucom.2023.02.032_b0100) 2020
Yan (10.1016/j.neucom.2023.02.032_b0125) 2020; 19
Chung (10.1016/j.neucom.2023.02.032_b0120) 2012
Kipf (10.1016/j.neucom.2023.02.032_b0070) 2017
Zhu (10.1016/j.neucom.2023.02.032_b0075) 2022; 494
Xiong (10.1016/j.neucom.2023.02.032_b0105) 2020
10.1016/j.neucom.2023.02.032_b0115
Liu (10.1016/j.neucom.2023.02.032_b0005) 2021; 12
Zhou (10.1016/j.neucom.2023.02.032_b0175) 2019; 30
Nguyen (10.1016/j.neucom.2023.02.032_b0160) 2021; 31
Wang (10.1016/j.neucom.2023.02.032_b0170) 2021; 116
Xu (10.1016/j.neucom.2023.02.032_b0230) 2018; 45
Mao (10.1016/j.neucom.2023.02.032_b0040) 2021; 52
Sadi (10.1016/j.neucom.2023.02.032_b0180) 2019
Geng (10.1016/j.neucom.2023.02.032_b0135) 2020
Cuthill (10.1016/j.neucom.2023.02.032_b0255) 1969
Zou (10.1016/j.neucom.2023.02.032_b0045) 2021; 67
Zhao (10.1016/j.neucom.2023.02.032_b0055) 2022; 493
10.1016/j.neucom.2023.02.032_b0245
Zhang (10.1016/j.neucom.2023.02.032_b0110) 2020; 19
Liang (10.1016/j.neucom.2023.02.032_b0130) 2021; 70
Zhang (10.1016/j.neucom.2023.02.032_b0210) 2021
Hamann (10.1016/j.neucom.2023.02.032_b0260) 2016; 6
Ma (10.1016/j.neucom.2023.02.032_b0195) 2019
Li (10.1016/j.neucom.2023.02.032_b0150) 2021
Pei (10.1016/j.neucom.2023.02.032_b0220) 2021
Ju (10.1016/j.neucom.2023.02.032_b0060) 2021; 52
Kiningham (10.1016/j.neucom.2023.02.032_b0205) 2022
References_xml – volume: 471
  start-page: 118
  year: 2022
  end-page: 129
  ident: b0090
  article-title: TVGCN: Time-variant graph convolutional network for traffic forecasting
  publication-title: Neurocomputing
– volume: 121
  year: 2021
  ident: b0085
  article-title: A novel GCN-based point cloud classification model robust to pose variances
  publication-title: Pattern Recognition
– reference: Y. Rong, W. Huang, T. Xu, J. Huang, DropEdge: Towards deep graph convolutional networks on node classification, in: International Conference on Learning Representations, 2020.
– reference: M. Yan, L. Deng, X. Hu, L. Liang, Y. Feng, X. Ye, Z. Zhang, D. Fan, Y. Xie, HyGCN: A GCN accelerator with hybrid architecture, in: IEEE International Symposium on High Performance Computer Architecture, 2020, pp. 15–29.
– start-page: 1
  year: 2022
  end-page: 14
  ident: b0065
  article-title: A survey on fault-tolerant consensus control of multi-agent systems: trends, methodologies and prospects
  publication-title: International Journal of Systems Science
– start-page: 1
  year: 2019
  end-page: 6
  ident: b0080
  article-title: High performance graph convolutionai networks with applications in testability analysis
  publication-title: 56th ACM/IEEE Design Automation Conference
– volume: 70
  start-page: 1511
  year: 2021
  end-page: 1525
  ident: b0130
  article-title: EnGN, A high-throughput and energy-efficient accelerator for large graph neural networks
  publication-title: IEEE Transactions on Computers
– start-page: 1
  year: 2012
  end-page: 8
  ident: b0120
  article-title: Application data prefetching on the IBM blue Gene/Q supercomputer
  publication-title: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
– start-page: 922
  year: 2020
  end-page: 936
  ident: b0135
  article-title: AWB-GCN: A graph convolutional network accelerator with runtime workload rebalancing
  publication-title: 53rd Annual IEEE/ACM International Symposium on Microarchitecture
– year: 2022
  ident: b0205
  article-title: GRIP: A graph neural network accelerator architecture
  publication-title: IEEE Transactions on Computers
– start-page: 775
  year: 2021
  end-page: 788
  ident: b0150
  article-title: GCNAX: A flexible and energy-efficient accelerator for graph convolutional neural networks
  publication-title: 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA)
– volume: 6
  start-page: 1
  year: 2016
  end-page: 22
  ident: b0260
  article-title: Structure-preserving sparsification methods for social networks
  publication-title: Social Network Analysis and Mining
– year: 2019
  ident: b0190
  article-title: Deep graph library: Towards efficient and scalable deep learning on graphs
  publication-title: ICLR workshop on representation learning on graphs and manifolds
– reference: P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, arXiv preprint arXiv:1710.10903 (2017).
– year: 2017
  ident: b0070
  article-title: Semi-supervised classification with graph convolutional networks
  publication-title: Proceedings of 5th International Conference on Learning Representations
– volume: 52
  start-page: 3390
  year: 2021
  end-page: 3409
  ident: b0060
  article-title: Fault detection of networked dynamical systems: A survey of trends and techniques
  publication-title: International Journal of Systems Science
– start-page: 1
  year: 2020
  end-page: 6
  ident: b0200
  article-title: Hardware acceleration of graph neural networks
  publication-title: 57th ACM/IEEE Design Automation Conference
– volume: 9
  start-page: 1395
  year: 2022
  end-page: 1408
  ident: b0050
  article-title: Neural-network-based filtering for a general class of nonlinear systems under dynamically bounded innovations over sensor networks
  publication-title: IEEE Transactions on Network Science and Engineering
– volume: 25
  start-page: 1243
  year: 2020
  end-page: 1254
  ident: b0035
  article-title: A deep learning-based remaining useful life prediction approach for bearings
  publication-title: IEEE/ASME transactions on mechatronics
– reference: A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, J. Wiener, Graph structure in the web, in: The Structure and Dynamics of Networks, Princeton University Press, 2011, pp. 183–194.
– volume: 19
  start-page: 22
  year: 2020
  end-page: 25
  ident: b0125
  article-title: Characterizing and understanding GCNs on GPU
  publication-title: IEEE Computer Architecture Letters
– reference: M. Fey, J.E. Lenssen, Fast graph representation learning with PyTorch Geometric, arXiv preprint arXiv:1903.02428 (2019).
– start-page: 779
  year: 2021
  end-page: 787
  ident: b0240
  article-title: Learning to drop: Robust graph neural network via topological denoising
  publication-title: Proceedings of the 14th ACM International Conference on Web Search and Data Mining
– volume: 376
  start-page: 95
  year: 2020
  end-page: 102
  ident: b0030
  article-title: Remaining useful life prediction of lithium-ion batteries with adaptive unscented kalman filter and optimized support vector regression
  publication-title: Neurocomputing
– reference: Y. Zhang, J. Sun, J. Zhang, H. Shen, Y. She, Y. Chang, Health state assessment of bearing with feature enhancement and prediction error compensation strategy, Mechanical Systems and Signal Processing 182 109573.
– reference: S. Brody, U. Alon, E. Yahav, How attentive are graph attention networks?, arXiv preprint arXiv:2105.14491 (2021).
– volume: 503
  start-page: 314
  year: 2022
  end-page: 324
  ident: b0010
  article-title: A semi-supervised learning approach for COVID-19 detection from chest CT scans
  publication-title: Neurocomputing
– volume: 7
  start-page: 418
  year: 2020
  end-page: 429
  ident: b0025
  article-title: A general end-to-end diagnosis framework for manufacturing systems
  publication-title: National Science Review
– volume: 52
  start-page: 1110
  year: 2021
  end-page: 1128
  ident: b0040
  article-title: Recursive filtering of networked nonlinear systems: A survey
  publication-title: International Journal of Systems Science
– volume: 493
  start-page: 583
  year: 2022
  end-page: 591
  ident: b0055
  article-title: Estimator-based iterative deviation-free residual generator for fault detection under random access protocol
  publication-title: Neurocomputing
– reference: S. Abadal, A. Jain, R. Guirado, J. López-Alonso, E. Alarcón, Computing graph neural networks: A survey from algorithms to accelerators, arXiv preprint arXiv:2010.00130 (2020).
– start-page: 936
  year: 2020
  end-page: 945
  ident: b0100
  article-title: Pcgcn: Partition-centric processing for accelerating graph convolutional network
  publication-title: IEEE International Parallel and Distributed Processing Symposium
– start-page: 33
  year: 2021
  end-page: 40
  ident: b0225
  article-title: RFC-HyPGCN: A runtime sparse feature compress accelerator for skeleton-based GCNs action recognition model with hybrid pruning
  publication-title: 2021 IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP)
– start-page: 1469
  year: 2021
  end-page: 1474
  ident: b0220
  article-title: STARS: Spatial temporal graph convolution network for action recognition system on FPGAs
  publication-title: 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC)
– volume: 19
  start-page: 59
  year: 2020
  end-page: 62
  ident: b0110
  article-title: Architectural implications of graph neural networks
  publication-title: IEEE Computer architecture letters
– volume: 31
  start-page: 2450
  year: 2021
  end-page: 2464
  ident: b0160
  article-title: Layer-specific optimization for mixed data flow with mixed precision in FPGA design for CNN-based object detectors
  publication-title: IEEE Transactions on Circuits and Systems for Video Technology
– start-page: 157
  year: 1969
  end-page: 172
  ident: b0255
  article-title: Reducing the bandwidth of sparse symmetric matrices
  publication-title: Proceedings of the 24th National Conference
– volume: 407
  start-page: 439
  year: 2020
  end-page: 453
  ident: b0155
  article-title: Residue number system-based solution for reducing the hardware cost of a convolutional neural network
  publication-title: Neurocomputing
– start-page: 443
  year: 2019
  end-page: 458
  ident: b0195
  article-title: Neugraph: parallel deep neural network computation on large graphs
  publication-title: Annual Technical Conference
– start-page: 347
  year: 2019
  end-page: 358
  ident: b0180
  article-title: Efficient spmv operation for large and highly sparse matrices using scalable multi-way merge parallelization
  publication-title: Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture
– volume: 67
  start-page: 304
  year: 2021
  end-page: 319
  ident: b0045
  article-title: Ultimately bounded filtering subject to impulsive measurement outliers
  publication-title: IEEE Transactions on Automatic Control
– volume: 220
  year: 2022
  ident: b0015
  article-title: Health status assessment and remaining useful life prediction of aero-engine based on BiGRU and MMoE
  publication-title: Reliability Engineering & System Safety
– volume: 30
  start-page: 2249
  year: 2019
  end-page: 2264
  ident: b0175
  article-title: Hitgraph: High-throughput graph processing framework on FPGA
  publication-title: IEEE Transactions on Parallel and Distributed Systems
– reference: B. Zhang, H. Zeng, V. Prasanna, Hardware acceleration of large scale GCN inference, in: IEEE 31st International Conference on Application-specific Systems, Architectures and Processors, 2020, pp. 61–68.
– volume: 12
  start-page: 1939
  year: 2021
  end-page: 1948
  ident: b0005
  article-title: A PSO-based deep learning approach to classifying patients from emergency departments
  publication-title: International Journal of Machine Learning and Cybernetics
– start-page: 92
  year: 2020
  end-page: 96
  ident: b0105
  article-title: A survey of FPGA based on graph convolutional neural network accelerator
  publication-title: International Conference on Computer Engineering and Intelligent Control
– start-page: 29
  year: 2021
  end-page: 39
  ident: b0210
  article-title: BoostGCN: A framework for optimizing GCN inference on FPGA
  publication-title: IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines
– volume: 457
  start-page: 141
  year: 2021
  end-page: 154
  ident: b0165
  article-title: HSC: Leveraging horizontal shortcut connections for improving accuracy and computational efficiency of lightweight CNN
  publication-title: Neurocomputing
– volume: 494
  start-page: 33
  year: 2022
  end-page: 42
  ident: b0075
  article-title: SI-News: Integrating social information for news recommendation with attention-based graph convolutional network
  publication-title: Neurocomputing
– volume: 45
  start-page: 24
  year: 2018
  end-page: 30
  ident: b0230
  article-title: Survey of graph sparsification algorithms for complex networks
  publication-title: Computer Science
– start-page: 1977
  year: 2020
  end-page: 1980
  ident: b0235
  article-title: Label-aware graph convolutional networks
  publication-title: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
– reference: Z. Tao, C. Wu, Y. Liang, L. He, LW-GCN: A lightweight FPGA-based graph convolutional network accelerator, arXiv preprint arXiv:2111.03184 (2021).
– volume: 116
  year: 2021
  ident: b0170
  article-title: S-CNN-ESystem: An end-to-end embedded CNN inference system with low hardware cost and hardware-software time-balancing
  publication-title: Journal of Systems Architecture
– volume: 9
  start-page: 1395
  issue: 3
  year: 2022
  ident: 10.1016/j.neucom.2023.02.032_b0050
  article-title: Neural-network-based filtering for a general class of nonlinear systems under dynamically bounded innovations over sensor networks
  publication-title: IEEE Transactions on Network Science and Engineering
  doi: 10.1109/TNSE.2022.3144484
– volume: 30
  start-page: 2249
  issue: 10
  year: 2019
  ident: 10.1016/j.neucom.2023.02.032_b0175
  article-title: Hitgraph: High-throughput graph processing framework on FPGA
  publication-title: IEEE Transactions on Parallel and Distributed Systems
  doi: 10.1109/TPDS.2019.2910068
– start-page: 1
  year: 2019
  ident: 10.1016/j.neucom.2023.02.032_b0080
  article-title: High performance graph convolutionai networks with applications in testability analysis
– volume: 6
  start-page: 1
  issue: 1
  year: 2016
  ident: 10.1016/j.neucom.2023.02.032_b0260
  article-title: Structure-preserving sparsification methods for social networks
  publication-title: Social Network Analysis and Mining
  doi: 10.1007/s13278-016-0332-2
– start-page: 33
  year: 2021
  ident: 10.1016/j.neucom.2023.02.032_b0225
  article-title: RFC-HyPGCN: A runtime sparse feature compress accelerator for skeleton-based GCNs action recognition model with hybrid pruning
– volume: 494
  start-page: 33
  year: 2022
  ident: 10.1016/j.neucom.2023.02.032_b0075
  article-title: SI-News: Integrating social information for news recommendation with attention-based graph convolutional network
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2022.04.073
– start-page: 347
  year: 2019
  ident: 10.1016/j.neucom.2023.02.032_b0180
  article-title: Efficient spmv operation for large and highly sparse matrices using scalable multi-way merge parallelization
– volume: 19
  start-page: 22
  issue: 1
  year: 2020
  ident: 10.1016/j.neucom.2023.02.032_b0125
  article-title: Characterizing and understanding GCNs on GPU
  publication-title: IEEE Computer Architecture Letters
  doi: 10.1109/LCA.2020.2970395
– start-page: 1
  year: 2020
  ident: 10.1016/j.neucom.2023.02.032_b0200
  article-title: Hardware acceleration of graph neural networks
– volume: 376
  start-page: 95
  year: 2020
  ident: 10.1016/j.neucom.2023.02.032_b0030
  article-title: Remaining useful life prediction of lithium-ion batteries with adaptive unscented kalman filter and optimized support vector regression
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2019.09.074
– start-page: 92
  year: 2020
  ident: 10.1016/j.neucom.2023.02.032_b0105
  article-title: A survey of FPGA based on graph convolutional neural network accelerator
– volume: 471
  start-page: 118
  year: 2022
  ident: 10.1016/j.neucom.2023.02.032_b0090
  article-title: TVGCN: Time-variant graph convolutional network for traffic forecasting
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2021.11.006
– year: 2022
  ident: 10.1016/j.neucom.2023.02.032_b0205
  article-title: GRIP: A graph neural network accelerator architecture
  publication-title: IEEE Transactions on Computers
– year: 2017
  ident: 10.1016/j.neucom.2023.02.032_b0070
  article-title: Semi-supervised classification with graph convolutional networks
– start-page: 157
  year: 1969
  ident: 10.1016/j.neucom.2023.02.032_b0255
  article-title: Reducing the bandwidth of sparse symmetric matrices
– volume: 116
  year: 2021
  ident: 10.1016/j.neucom.2023.02.032_b0170
  article-title: S-CNN-ESystem: An end-to-end embedded CNN inference system with low hardware cost and hardware-software time-balancing
  publication-title: Journal of Systems Architecture
  doi: 10.1016/j.sysarc.2021.102122
StartPage 129
SubjectTerms FPGA
Graph convolutional network
Graph sparsification
Hardware acceleration
Software-hardware co-design
Title Software-hardware co-design for accelerating large-scale graph convolutional network inference on FPGA
URI https://dx.doi.org/10.1016/j.neucom.2023.02.032
Volume 532