Floating-Point Computations on Reconfigurable Computers

Bibliographic Details
Published in: 2007 DoD High Performance Computing Modernization Program Users Group Conference, pp. 339 - 344
Main Author: Morris, G.R.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2007
Summary: Modern reconfigurable computers (RCs) combine general-purpose processors with field programmable gate arrays (FPGAs). The FPGAs are, in effect, reconfigurable application-specific coprocessors. During one run, the FPGA might be a matrix-vector multiply coprocessor; during another run, it might be a linear equation solver. There are several issues associated with the mapping of floating-point computations onto RCs. One is the determination of what the author terms "the FPGA design boundary," i.e., the portion of the application that is mapped onto the FPGA. Furthermore, FPGA-based kernel performance is heavily dependent upon both pipelining and parallelism. The author has coined the phrase "the three p's" to encapsulate this important relationship. In this paper, important FPGA design boundary heuristics are described, and a toroidal architecture and partitioned loop algorithm are used to maximize both pipelining and parallelism for a double-precision floating-point sparse matrix conjugate gradient solver that is mapped onto a reconfigurable computer. Wall clock run time comparisons show that the FPGA-augmented version runs more than two times faster than the software-only version.
DOI:10.1109/HPCMP-UGC.2007.35
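
To make the "FPGA design boundary" idea concrete, the following is a minimal software sketch, not taken from the paper, of a double-precision conjugate gradient solver. The sparse matrix-vector multiply (spmv) is the kind of deeply pipelined, parallelizable kernel that would typically sit inside the FPGA design boundary, while the scalar reductions and vector updates remain on the general-purpose processor. The CSR data layout, function names, and test matrix are illustrative assumptions.

```c
/*
 * Hypothetical sketch of a CG iteration on a reconfigurable computer.
 * spmv() marks the candidate FPGA-side kernel; everything else stays
 * on the host processor.  Layout and names are assumptions.
 */
#include <math.h>
#include <stdio.h>

/* Candidate FPGA kernel: y = A*x for a CSR-format sparse matrix. */
static void spmv(int n, const int *rowptr, const int *col,
                 const double *val, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += val[k] * x[col[k]];   /* pipelined multiply-accumulate */
        y[i] = sum;
    }
}

static double dot(int n, const double *a, const double *b)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += a[i] * b[i];
    return s;
}

/* Host-side CG loop; solves A*x = b for symmetric positive-definite A. */
static int cg(int n, const int *rowptr, const int *col, const double *val,
              const double *b, double *x, double tol, int maxit)
{
    double r[16], p[16], q[16];          /* n <= 16 in this toy example */
    spmv(n, rowptr, col, val, x, q);
    for (int i = 0; i < n; i++) { r[i] = b[i] - q[i]; p[i] = r[i]; }
    double rr = dot(n, r, r);
    for (int it = 0; it < maxit; it++) {
        if (sqrt(rr) < tol) return it;
        spmv(n, rowptr, col, val, p, q);          /* FPGA-side work */
        double alpha = rr / dot(n, p, q);
        for (int i = 0; i < n; i++) { x[i] += alpha * p[i]; r[i] -= alpha * q[i]; }
        double rr_new = dot(n, r, r);
        double beta = rr_new / rr;
        for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return maxit;
}

int main(void)
{
    /* 3x3 SPD tridiagonal matrix [2 -1 0; -1 2 -1; 0 -1 2] in CSR form. */
    int    rowptr[] = {0, 2, 5, 7};
    int    col[]    = {0, 1, 0, 1, 2, 1, 2};
    double val[]    = {2, -1, -1, 2, -1, -1, 2};
    double b[]      = {1, 0, 1};
    double x[]      = {0, 0, 0};
    int iters = cg(3, rowptr, col, val, b, x, 1e-12, 100);
    printf("converged in %d iterations: x = [%g %g %g]\n",
           iters, x[0], x[1], x[2]);
    return 0;
}
```

In a sketch like this, the inner spmv loop is where pipelining and parallelism (the paper's "three p's" relationship between performance, pipelining, and parallelism) pay off, which is why it is the natural candidate to place inside the FPGA design boundary.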