Software-hardware co-design for accelerating large-scale graph convolutional network inference on FPGA
Published in | Neurocomputing (Amsterdam) Vol. 532; pp. 129 - 140 |
---|---|
Main Authors | Ran, Shaolin; Zhao, Beizhen; Dai, Xing; Cheng, Cheng; Zhang, Yong |
Format | Journal Article |
Language | English |
Published | Elsevier B.V., 01.05.2023 |
Abstract | Inspired by convolutional neural networks, graph convolutional networks (GCNs) have been proposed for processing non-Euclidean graph data and have been successfully applied in recommendation systems, smart traffic, and other domains. However, owing to the sparsity and irregularity of GCN models, the complex execution pattern of large-scale GCNs poses major challenges to efficient inference on general-purpose CPUs and GPUs, such as workload imbalance and irregular memory access. Therefore, we propose a software-hardware co-design framework for low-latency GCN inference on field-programmable gate arrays (FPGAs). Specifically, at the algorithm level, we propose an attention-mechanism-based graph sparsification approach that reduces redundant relations in the graph structure and alleviates irregularity without losing accuracy. Then, at the hardware design level, based on the sparsified graph, we propose a two-stage hardware architecture that supports the two phases of the GCN, each of which has a distinct execution mode. To achieve low-latency computation, edge-level and feature-level parallelism are exploited in the aggregation phase. In addition, a graph partition strategy is used to improve data reuse. The experimental results demonstrate that the proposed framework achieves a 739× speedup over CPU and a 13.7× speedup over GPU on average, and a 6.8× speedup over state-of-the-art accelerators. |
---|---|
Author | Cheng, Cheng; Zhao, Beizhen; Ran, Shaolin; Dai, Xing; Zhang, Yong |
Author_xml | – sequence: 1 givenname: Shaolin surname: Ran fullname: Ran, Shaolin organization: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China – sequence: 2 givenname: Beizhen surname: Zhao fullname: Zhao, Beizhen organization: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China – sequence: 3 givenname: Xing surname: Dai fullname: Dai, Xing organization: School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China – sequence: 4 givenname: Cheng surname: Cheng fullname: Cheng, Cheng organization: School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China – sequence: 5 givenname: Yong surname: Zhang fullname: Zhang, Yong email: zhangyong77@wust.edu.cn organization: School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China |
ContentType | Journal Article |
Copyright | 2023 Elsevier B.V. |
Copyright_xml | – notice: 2023 Elsevier B.V. |
DOI | 10.1016/j.neucom.2023.02.032 |
Discipline | Computer Science |
EISSN | 1872-8286 |
EndPage | 140 |
ExternalDocumentID | 10_1016_j_neucom_2023_02_032 S0925231223001698 |
ISSN | 0925-2312 |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | Software-hardware co-design Graph convolutional network Graph sparsification FPGA Hardware acceleration |
Language | English |
PageCount | 12 |
PublicationCentury | 2000 |
PublicationDate | 2023-05-01 |
PublicationDateYYYYMMDD | 2023-05-01 |
PublicationDate_xml | – month: 05 year: 2023 text: 2023-05-01 day: 01 |
PublicationDecade | 2020 |
PublicationTitle | Neurocomputing (Amsterdam) |
PublicationYear | 2023 |
Publisher | Elsevier B.V |
Publisher_xml | – name: Elsevier B.V |
StartPage | 129 |
SubjectTerms | FPGA; Graph convolutional network; Graph sparsification; Hardware acceleration; Software-hardware co-design |
Title | Software-hardware co-design for accelerating large-scale graph convolutional network inference on FPGA |
URI | https://dx.doi.org/10.1016/j.neucom.2023.02.032 |
Volume | 532 |
linkProvider | Elsevier |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Software-hardware+co-design+for+accelerating+large-scale+graph+convolutional+network+inference+on+FPGA&rft.jtitle=Neurocomputing+%28Amsterdam%29&rft.au=Ran%2C+Shaolin&rft.au=Zhao%2C+Beizhen&rft.au=Dai%2C+Xing&rft.au=Cheng%2C+Cheng&rft.date=2023-05-01&rft.issn=0925-2312&rft.volume=532&rft.spage=129&rft.epage=140&rft_id=info:doi/10.1016%2Fj.neucom.2023.02.032&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_neucom_2023_02_032 |