Design of Intra Cluster Access Structure for Distributed Caches of Array Processor

Bibliographic Details
Published in	International Conference on Measuring Technology and Mechatronics Automation (Print) pp. 66 - 73
Main Authors	Liu, You-Yao; Cai, Hui-Nan; Han, Si-Yi
Format	Conference Proceeding
Language	English
Published	IEEE 01.01.2022
Subjects	Array processor; Bandwidth; Distributed; Hardware; Iterative algorithms; Mapping; Memory management; Newton iterative method; Parallel processing; Reconfigurable computing; System-on-chip; Topology

Abstract	To meet the high-bandwidth, low-latency memory requirements of reconfigurable computing, and to exploit the high data parallelism, heavy memory-access volume, and pronounced temporal locality of reconfigurable array processors, an intra-cluster access structure is proposed. Built on the distributed Cache of a reconfigurable array processor, the structure implements parallel cross memory access by the processing elements (PEs) in a cluster, which addresses the severe shortage of storage bandwidth, raises intra-cluster memory access speed, and reduces memory access power consumption. A prototype of the design is verified on an FPGA development board. Under conflict-free memory access, parallel read-write access by the 4 × 4 PE array in a cluster reaches a maximum frequency of 220 MHz and a peak memory access bandwidth of 7.53 GB/s. Finally, the Newton iterative detection algorithm is mapped onto the structure, which provides 329.44 MB/s of data access bandwidth for the algorithm; the running time is 0.38 ms.
Author	Cai, Hui-Nan; Han, Si-Yi; Liu, You-Yao
Author_xml	– sequence: 1 givenname: You-Yao surname: Liu fullname: Liu, You-Yao organization: Xi'an University of Posts & Telecommunications, School of Electronic Engineering, Xi'an, China, 710121
– sequence: 2 givenname: Hui-Nan surname: Cai fullname: Cai, Hui-Nan email: 97463795@qq.com organization: Xi'an University of Posts & Telecommunications, School of Electronic Engineering, Xi'an, China, 710121
– sequence: 3 givenname: Si-Yi surname: Han fullname: Han, Si-Yi organization: Xi'an University of Posts & Telecommunications, School of Electronic Engineering, Xi'an, China, 710121
CODEN	IEEPAD
ContentType	Conference Proceeding
DOI	10.1109/ICMTMA54903.2022.00020
Discipline	Engineering
EISBN	9781665499781 1665499788
EISSN	2157-1481
EndPage	73
ExternalDocumentID	9724009
Genre	orig-research
GrantInformation_xml	– fundername: National Natural Science Foundation of China funderid: 10.13039/501100001809
IsPeerReviewed	false
IsScholarly	false
Language	English
PageCount	8
ParticipantIDs	ieee_primary_9724009
PublicationCentury	2000
PublicationDate	2022-Jan.
PublicationDateYYYYMMDD	2022-01-01
PublicationDate_xml	– month: 01 year: 2022 text: 2022-Jan.
PublicationDecade	2020
PublicationTitle	International Conference on Measuring Technology and Mechatronics Automation (Print)
PublicationTitleAbbrev	ICMTMA
PublicationYear	2022
Publisher	IEEE
Publisher_xml	– name: IEEE
SourceID	ieee
SourceType	Publisher
StartPage	66
Title	Design of Intra Cluster Access Structure for Distributed Caches of Array Processor
URI	https://ieeexplore.ieee.org/document/9724009
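
The figures reported in the abstract can be related to one another with some simple arithmetic. The following is a minimal sketch, not taken from the paper: the function and constant names are hypothetical, "GB" and "MB" are assumed to be decimal units (10^9 and 10^6 bytes), and the 329.44 MB/s figure is assumed to hold for the whole 0.38 ms run.

```python
# Back-of-envelope arithmetic on the figures quoted in the abstract.
# All names are illustrative; unit conventions are assumptions (decimal GB/MB).

PE_COUNT = 4 * 4                 # 4 x 4 PE array in one cluster (from the abstract)
MAX_FREQ_HZ = 220e6              # 220 MHz maximum access frequency (from the abstract)
PEAK_BW_BYTES_PER_S = 7.53e9     # 7.53 GB/s peak bandwidth, assuming 1 GB = 10**9 bytes
NEWTON_BW_BYTES_PER_S = 329.44e6 # 329.44 MB/s, assuming 1 MB = 10**6 bytes
NEWTON_RUNTIME_S = 0.38e-3       # 0.38 ms running time (from the abstract)


def bytes_per_cycle(bandwidth_bytes_per_s: float, freq_hz: float) -> float:
    """Bytes transferred per clock cycle at the given bandwidth and frequency."""
    return bandwidth_bytes_per_s / freq_hz


def data_moved(bandwidth_bytes_per_s: float, runtime_s: float) -> float:
    """Bytes moved if the bandwidth were sustained for the whole runtime."""
    return bandwidth_bytes_per_s * runtime_s


cluster_bytes_per_cycle = bytes_per_cycle(PEAK_BW_BYTES_PER_S, MAX_FREQ_HZ)
per_pe_bytes_per_cycle = cluster_bytes_per_cycle / PE_COUNT
newton_bytes = data_moved(NEWTON_BW_BYTES_PER_S, NEWTON_RUNTIME_S)
newton_share_of_peak = NEWTON_BW_BYTES_PER_S / PEAK_BW_BYTES_PER_S

if __name__ == "__main__":
    print(f"cluster bytes per cycle at peak: {cluster_bytes_per_cycle:.2f}")
    print(f"bytes per cycle per PE at peak:  {per_pe_bytes_per_cycle:.2f}")
    print(f"data moved by Newton mapping:    {newton_bytes / 1e3:.1f} kB")
    print(f"Newton bandwidth / peak:         {newton_share_of_peak:.1%}")
```

Under these assumptions the peak figure corresponds to roughly 34.2 bytes per cycle for the cluster, and the Newton mapping moves about 125 kB in total. The abstract does not state the word width or the unit convention, so the per-PE value is only indicative.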