Exact Fused Dot Product Add Operators
This article explores architectures of exact (correctly rounded) fused dot product and add operators suitable for the FP32 and FP64 binary floating-point representations with subnormal support, and other representations with a wide dynamic range such as bfloat16. The exact summation of terms before...
Saved in:
Published in | 2023 IEEE 30th Symposium on Computer Arithmetic (ARITH) pp. 151 - 158 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published |
IEEE
04.09.2023
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | This article explores architectures of exact (correctly rounded) fused dot product and add operators suitable for the FP32 and FP64 binary floating-point representations with subnormal support, and other representations with a wide dynamic range such as bfloat16. The exact summation of terms before rounding requires a full-size accumulator, and this work discusses techniques to compress the identical bits of this accumulator. This requires the computation of the relative shift amounts of the terms, which is formulated as a parallel prefix algorithm, allowing for a low-latency implementation. Architectural options for the exact fused dot product and add operators with up to 16 products for FP32, FP64 and mixed-precision BF16 to FP32 are evaluated using the TSMC 16FFC technology node. |
---|---|
ISSN: | 2576-2265 |
DOI: | 10.1109/ARITH58626.2023.00016 |