RADAR: An Efficient FPGA-based ResNet Accelerator with Data-aware Reordering of Processing Sequences
Published in | Journal of semiconductor technology and science Vol. 25; no. 4; pp. 451 - 458 |
---|---|
Main Authors | Park, Juntae; Choi, Dahun; Kim, Hyun |
Format | Journal Article |
Language | English |
Published | 대한전자공학회, 31.08.2025 |
Subjects | |
Abstract | The deployment of compact convolutional neural network (CNN) models with skip connections on edge devices through dedicated hardware accelerators is increasingly prevalent. However, optimizing the use of limited on-chip memory (OCM) across multiple CNN layers, especially those with skip connections, remains a challenge.
In this paper, we propose a CNN accelerator technique that reorders the computation sequence of each layer to maximize data reuse within the OCM, thereby minimizing DRAM access and improving the utilization of both the OCM and the convolution processor. Additionally, we introduce a shared buffer design that manages OCM usage efficiently across layers, particularly those involving skip connections. Finally, we present RADAR, a ResNet-18 accelerator IP implemented with the proposed technique on a Xilinx ZCU102 FPGA. RADAR achieves 446.9 GOPS at 64.9 GOPS/W while maintaining high accuracy, a significant improvement over prior work in the trade-off among throughput, hardware resource efficiency, and model accuracy. KCI Citation Count: 0 |
---|---|
Author | Choi, Dahun Kim, Hyun Park, Juntae |
Author_xml | – sequence: 1 givenname: Juntae surname: Park fullname: Park, Juntae – sequence: 2 givenname: Dahun surname: Choi fullname: Choi, Dahun – sequence: 3 givenname: Hyun surname: Kim fullname: Kim, Hyun |
BackLink | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003232511$$DAccess content in National Research Foundation of Korea (NRF) |
BookMark | eNotkFtLwzAAhYMouE3_gE95FjpzbVLfym5Oho6u7yFNk1k3G00qw39v64QDhwMf5-Ebg8vWtxaAO4ymnAv68Lwrd1OCCJ_2YVPG8QUYEUJpwmSaXoIR5plMcMrFNRjH-I5QKkUmRqAu8nlePMK8hQvnGtPYtoPL7SpPKh1tDQsbX2wHc2Ps0Qbd-QBPTfcG57rTiT7pYHvEh9qGpt1D7-A2eGNjHNbOfn3btl834MrpY7S3_z0B5XJRzp6SzetqPcs3iRESJxmpJccaSSelpbTShInKISskEw6lrjY1qqUwWBhZZRQzwTMueJUJbQThhE7A_fm2DU4dTKO8bv5679UhqLwo1wojwRgSuIfJGTbBxxisU5-h-dDhp0fU4FQNTtXgVPVhqndKfwEhlGp3 |
ContentType | Journal Article |
DBID | AAYXX CITATION ACYCR |
DOI | 10.5573/JSTS.2025.25.4.451 |
DatabaseName | CrossRef Korean Citation Index |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 2233-4866 |
EndPage | 458 |
ExternalDocumentID | oai_kci_go_kr_ARTI_10744071 10_5573_JSTS_2025_25_4_451 |
GroupedDBID | 9ZL AAYXX ADDVE AENEX ALMA_UNASSIGNED_HOLDINGS CITATION DBRKI FRP GW5 HH5 JDI OK1 TDB TR2 ACYCR C1A KVFHK MZR ZZE |
ID | FETCH-LOGICAL-c781-92d851a08f88e33ba247bf0e7847f06fdcd0d87c17c8b9314759575b97ac72523 |
ISSN | 1598-1657 |
IngestDate | Sun Aug 24 03:18:59 EDT 2025 Wed Aug 27 16:28:55 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | true |
Issue | 4 |
Language | English |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c781-92d851a08f88e33ba247bf0e7847f06fdcd0d87c17c8b9314759575b97ac72523 |
PageCount | 8 |
ParticipantIDs | nrf_kci_oai_kci_go_kr_ARTI_10744071 crossref_primary_10_5573_JSTS_2025_25_4_451 |
PublicationCentury | 2000 |
PublicationDate | 2025-08-31 |
PublicationDateYYYYMMDD | 2025-08-31 |
PublicationDate_xml | – month: 08 year: 2025 text: 2025-08-31 day: 31 |
PublicationDecade | 2020 |
PublicationTitle | Journal of semiconductor technology and science |
PublicationYear | 2025 |
Publisher | 대한전자공학회 |
Publisher_xml | – name: 대한전자공학회 |
SSID | ssj0068797 |
Score | 2.3288388 |
Snippet | The deployment of compact convolutional neural network (CNN) models with skip connections on edge devices through dedicated hardware accelerators is... |
SourceID | nrf crossref |
SourceType | Open Website Index Database |
StartPage | 451 |
SubjectTerms | Electrical engineering |
Title | RADAR: An Efficient FPGA-based ResNet Accelerator with Data-aware Reordering of Processing Sequences |
URI | https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003232511 |
Volume | 25 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
ispartofPNX | JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, 2025, 25(4), 124, pp.451-458 |
linkProvider | ABC ChemistRy |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=RADAR%3A+An+Efficient+FPGA-based+ResNet+Accelerator+with+Data-aware+Reordering+of+Processing+Sequences&rft.jtitle=Journal+of+semiconductor+technology+and+science&rft.au=Juntae+Park&rft.au=Dahun+Choi&rft.au=Hyun+Kim&rft.date=2025-08-31&rft.pub=%EB%8C%80%ED%95%9C%EC%A0%84%EC%9E%90%EA%B3%B5%ED%95%99%ED%9A%8C&rft.issn=1598-1657&rft.eissn=2233-4866&rft.spage=451&rft.epage=458&rft_id=info:doi/10.5573%2FJSTS.2025.25.4.451&rft.externalDBID=n%2Fa&rft.externalDocID=oai_kci_go_kr_ARTI_10744071 |
thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1598-1657&client=summon |
thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1598-1657&client=summon |
thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1598-1657&client=summon |
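
The abstract quotes two headline figures for RADAR, 446.9 GOPS and 64.9 GOPS/W. As a rough sanity check, the two can be combined to estimate the power draw they imply. The sketch below assumes both figures refer to the same operating point and the same power measurement, which this record does not confirm; the function name `implied_power_watts` is hypothetical and used only for illustration.

```python
# Figures quoted in the abstract of this record.
THROUGHPUT_GOPS = 446.9          # reported throughput
EFFICIENCY_GOPS_PER_W = 64.9     # reported energy efficiency


def implied_power_watts(throughput_gops: float, efficiency_gops_per_w: float) -> float:
    """Return the power (W) implied by a throughput and an efficiency figure.

    Assumption: both numbers describe the same workload and operating point.
    """
    if efficiency_gops_per_w <= 0:
        raise ValueError("efficiency must be positive")
    # GOPS divided by GOPS/W leaves watts.
    return throughput_gops / efficiency_gops_per_w


power_w = implied_power_watts(THROUGHPUT_GOPS, EFFICIENCY_GOPS_PER_W)
print(f"Implied power: {power_w:.2f} W")  # prints "Implied power: 6.89 W"
```

Under that assumption the implied power is about 6.9 W. The full text should be consulted for the measured power and for what it covers (for example, whole board versus programmable logic only).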