A Novel Fully Hardware-Implemented SVD Solver Based on Ultra-Parallel BCV Jacobi Algorithm

Bibliographic Details
Published in: IEEE transactions on circuits and systems. II, Express briefs, Vol. 69, no. 12, pp. 5114-5118
Main Authors: Hu, Tang; Li, Xiangdi; Yu, Xiao; Ren, Songnan; Yan, Li; Bai, Xuyang; Xu, Zhiwei; Zhu, Shiqiang
Format: Journal Article
Language: English
Published: New York, IEEE, 01.12.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Efficient FPGA-based floating-point singular value decomposition (SVD) is challenging because its computational complexity grows rapidly with the matrix dimension. Numerous hardware architectures have been proposed to improve SVD performance by increasing the capacity of computation units, reusing data, and enhancing bandwidth. These designs, however, are not optimal because of their low parallelism, poor data access efficiency, and inferior iteration scheduling. In this brief, we propose a block column vector Hestenes-Jacobi (BCV Jacobi) algorithm that decomposes an arbitrarily large matrix into several blocks, enhances access efficiency by customizing a distinctive data structure, and improves system-level parallelism by simplifying the iteration scheduling. The proposed BCV Jacobi algorithm also achieves better scalability and efficiency. Experimental results show that the proposed FPGA-based SVD processor is superior to other SVD implementations in terms of parallelism, data access efficiency, matrix size, and execution time. Compared with a state-of-the-art SVD accelerator engine, the proposed algorithm achieves a speedup of more than 2× on average.
ISSN: 1549-7747, 1558-3791
DOI: 10.1109/TCSII.2022.3200750
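
The summary builds on the Hestenes-Jacobi (one-sided Jacobi) iteration. As background only, the following is a minimal sketch of the classical, unblocked method in plain Python. It is not the BCV Jacobi algorithm from the article: the function name `hestenes_jacobi_svd`, the cyclic ordering of column pairs, and the `tol` and `max_sweeps` parameters are all assumptions made for illustration. Each step rotates one pair of columns until they are orthogonal; once every pair is orthogonal, the column norms are the singular values.

```python
import math


def hestenes_jacobi_svd(a, tol=1e-12, max_sweeps=60):
    """Return the singular values of matrix `a` (list of rows), largest first.

    Textbook one-sided Jacobi with a cyclic pair ordering (an assumption,
    not the scheduling used in the article).
    """
    m, n = len(a), len(a[0])
    # Column-major working copy: each column is one list.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])   # squared norm of column p
                beta = sum(x * x for x in cols[q])    # squared norm of column q
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))  # inner product
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Smaller-magnitude root of t^2 + 2*zeta*t - 1 = 0.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Plane rotation applied to the two columns.
                for i in range(m):
                    x, y = cols[p][i], cols[q][i]
                    cols[p][i] = c * x - s * y
                    cols[q][i] = s * x + c * y
        if not rotated:
            break  # a full sweep made no rotation: all pairs are orthogonal
    return sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)


if __name__ == "__main__":
    print(hestenes_jacobi_svd([[3.0, 0.0], [4.0, 5.0]]))
```

Rotations on disjoint column pairs touch different data and can therefore run concurrently, which is the property that parallel hardware implementations of Jacobi-type SVD generally exploit.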