Design of Intra Cluster Access Structure for Distributed Caches of Array Processor

Bibliographic Details
Published in International Conference on Measuring Technology and Mechatronics Automation (Print) pp. 66 - 73
Main Authors Liu, You-Yao; Cai, Hui-Nan; Han, Si-Yi
Format Conference Proceeding
Language English
Published IEEE 01.01.2022
Abstract Reconfigurable computing demands high memory bandwidth and low memory latency, and reconfigurable array processors are characterized by high data parallelism, a large volume of memory accesses, and pronounced temporal locality. To meet these requirements, an intra-cluster access structure is proposed. Built on the distributed Cache of a reconfigurable array processor, the structure realizes parallel cross memory access by the processing elements (PEs) within a cluster, addresses the severe shortage of storage bandwidth, improves memory access speed in the cluster, and reduces memory access power consumption. A prototype of the design is verified on an FPGA development board. Under conflict-free memory access, parallel read-write memory access by the 4 × 4 PE array in a cluster reaches a maximum frequency of 220 MHz, with a peak memory access bandwidth of 7.53 GB/s. Finally, a Newton iterative detection algorithm is mapped onto the structure, which provides the algorithm with a data access bandwidth of 329.44 MB/s; the running time is 0.38 ms.
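The figures quoted in the abstract can be related to one another with simple arithmetic. The following is a rough sketch only, not taken from the paper: it assumes decimal units (1 GB = 1e9 bytes, 1 MB = 1e6 bytes) and assumes that the 329.44 MB/s rate is sustained for the whole 0.38 ms run. All variable and function names are illustrative. Under those assumptions, the peak figure corresponds to roughly 34 bytes per clock cycle for the whole cluster, and the Newton iterative detection mapping moves about 125 KB of data.

```python
# Back-of-the-envelope check of the numbers quoted in the abstract.
# ASSUMPTIONS (not stated in the source): decimal units, and the algorithm's
# bandwidth is sustained over its entire running time.

PEAK_BANDWIDTH_BPS = 7.53e9      # 7.53 GB/s peak memory access bandwidth
MAX_FREQUENCY_HZ = 220e6         # 220 MHz maximum frequency
NUM_PES = 4 * 4                  # 4 x 4 PE array in one cluster
ALGO_BANDWIDTH_BPS = 329.44e6    # 329.44 MB/s for the detection algorithm
ALGO_RUNTIME_S = 0.38e-3         # 0.38 ms running time


def bytes_per_cycle(bandwidth_bps: float, frequency_hz: float) -> float:
    """Bytes transferred per clock cycle at the given bandwidth and clock."""
    return bandwidth_bps / frequency_hz


def data_moved(bandwidth_bps: float, runtime_s: float) -> float:
    """Total bytes moved if the bandwidth is sustained for the whole run."""
    return bandwidth_bps * runtime_s


cluster_bytes_per_cycle = bytes_per_cycle(PEAK_BANDWIDTH_BPS, MAX_FREQUENCY_HZ)
per_pe_bytes_per_cycle = cluster_bytes_per_cycle / NUM_PES
algo_bytes = data_moved(ALGO_BANDWIDTH_BPS, ALGO_RUNTIME_S)
peak_fraction_used = ALGO_BANDWIDTH_BPS / PEAK_BANDWIDTH_BPS

if __name__ == "__main__":
    print(f"cluster bytes/cycle at peak: {cluster_bytes_per_cycle:.2f}")
    print(f"per-PE bytes/cycle at peak:  {per_pe_bytes_per_cycle:.2f}")
    print(f"data moved by the algorithm: {algo_bytes / 1e3:.1f} KB")
    print(f"fraction of peak bandwidth:  {peak_fraction_used:.1%}")
```

The per-PE value is only an average across the 16 PEs; the abstract does not give the port width or the Cache organization, so no hardware parameter should be inferred from it.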
Author Cai, Hui-Nan
Han, Si-Yi
Liu, You-Yao
Author_xml – sequence: 1
  givenname: You-Yao
  surname: Liu
  fullname: Liu, You-Yao
  organization: Xi'an University of Posts & Telecommunications, School of Electronic Engineering, Xi'an, China, 710121
– sequence: 2
  givenname: Hui-Nan
  surname: Cai
  fullname: Cai, Hui-Nan
  email: 97463795@qq.com
  organization: Xi'an University of Posts & Telecommunications, School of Electronic Engineering, Xi'an, China, 710121
– sequence: 3
  givenname: Si-Yi
  surname: Han
  fullname: Han, Si-Yi
  organization: Xi'an University of Posts & Telecommunications, School of Electronic Engineering, Xi'an, China, 710121
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ICMTMA54903.2022.00020
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Xplore Digital Library
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 9781665499781
1665499788
EISSN 2157-1481
EndPage 73
ExternalDocumentID 9724009
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  funderid: 10.13039/501100001809
IEDL.DBID RIE
IngestDate Wed Aug 27 02:49:15 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
PageCount 8
ParticipantIDs ieee_primary_9724009
PublicationCentury 2000
PublicationDate 2022-Jan.
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – month: 01
  year: 2022
  text: 2022-Jan.
PublicationDecade 2020
PublicationTitle International Conference on Measuring Technology and Mechatronics Automation (Print)
PublicationTitleAbbrev ICMTMA
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003188972
Score 1.7837663
SourceID ieee
SourceType Publisher
StartPage 66
SubjectTerms Array processor
Bandwidth
Distributed
Hardware
Iterative algorithms
Mapping
Memory management
Newton iterative method
Parallel processing
Reconfigurable computing
System-on-chip
Topology
Title Design of Intra Cluster Access Structure for Distributed Caches of Array Processor
URI https://ieeexplore.ieee.org/document/9724009