RADAR: An Efficient FPGA-based ResNet Accelerator with Data-aware Reordering of Processing Sequences

Bibliographic Details
Published in Journal of semiconductor technology and science Vol. 25; no. 4; pp. 451-458
Main Authors Park, Juntae; Choi, Dahun; Kim, Hyun
Format Journal Article
Language English
Published 대한전자공학회 31.08.2025
Abstract
The deployment of compact convolutional neural network (CNN) models with skip connections on edge devices through dedicated hardware accelerators is increasingly prevalent. However, optimizing the use of limited on-chip memory (OCM) across multiple CNN layers, especially those with skip connections, remains a challenge. In this paper, we propose a novel CNN accelerator technique that reorders the computation sequence for each layer to maximize data reuse within the OCM, thereby minimizing DRAM access and improving the utilization of both the OCM and the convolution processor. Additionally, we introduce a shared buffer design that efficiently manages OCM usage across different layers, particularly those involving skip connections. Finally, we present a ResNet-18 accelerator IP, RADAR, implemented with the proposed technique on a Xilinx ZCU102 FPGA. RADAR achieves 64.9 GOPS/W and 446.9 GOPS while maintaining high accuracy, demonstrating significant improvements over prior works in terms of the trade-off between throughput, hardware resource efficiency, and model accuracy.
KCI Citation Count: 0
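As a rough reading aid for the two headline figures in the abstract, the sketch below divides throughput by energy efficiency to estimate the power draw they jointly imply. It assumes both numbers were measured under the same operating conditions, which the abstract does not state; the function name `implied_power_watts` is hypothetical and not taken from the paper.

```python
# Illustrative arithmetic only: the two input figures come from the abstract,
# everything else (function name, same-conditions assumption) is hypothetical.

def implied_power_watts(throughput_gops: float, efficiency_gops_per_watt: float) -> float:
    """Return the power (W) implied by a throughput and an energy-efficiency figure."""
    if efficiency_gops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    # GOPS divided by GOPS/W leaves watts.
    return throughput_gops / efficiency_gops_per_watt


# Figures reported for RADAR in the abstract: 446.9 GOPS and 64.9 GOPS/W.
radar_power = implied_power_watts(446.9, 64.9)
print(f"Implied power: {radar_power:.2f} W")
```

Under that assumption the implied figure is just under 7 W; the full text should be consulted for the measured power.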
BackLink https://www.kci.go.kr/kciportal/ci/sereArticleSearch/ciSereArtiView.kci?sereArticleSearchBean.artiId=ART003232511 (Access content in National Research Foundation of Korea (NRF))
DOI 10.5573/JSTS.2025.25.4.451
Discipline Engineering
EISSN 2233-4866
EndPage 458
ISSN 1598-1657
IsPeerReviewed false
IsScholarly true
Issue 4
PageCount 8
PublicationDate 2025-08-31
SubjectTerms Electrical engineering
Volume 25