An optimized design of delay- and energy-efficient Booth multiplier

Bibliographic Details
Published in	e-Prime Vol. 9; p. 100698
Main Authors	Rafiq, Ahsan; Jenihhin, Maksim
Format	Journal Article
Language	English
Published	Elsevier Ltd, 01.09.2024
Subjects	Booth multiplier; Carry-select adder; Compressors; Low-power designs; Parallel architectures

Summary:	Highlights:
•Addressed the high latency inherent in generating partial products for signed numbers by using eight NOT gates (inverters) and a single sign selector unit for two's complement selection.
•Optimized the Booth encoding multiplexers, which generate partial products according to the pertinent encoding logic, by streamlining the design and reducing the number of logic gates.
•Proposed efficient compressors with a delay of two XOR gates that handle the two's complement selection logic without introducing extra delay or area overhead.
•Proposed an optimized adder segment for the final summation stage that provides fast throughput with minimal fan-in logic gates.

Multipliers are essential computation units in virtually all computing systems, including processors and numerous AI accelerator architectures. This paper presents an optimized architecture for a Booth multiplier, targeting high performance while minimizing energy consumption and area utilization. The design optimization covers all three multiplier stages: partial product generation, reduction, and summation. To improve delay and energy efficiency in the partial product generation stage, we first employ a simplified configuration comprising inverters and a sign selection unit instead of complex binary-to-two's-complement circuitry. Next, to gain further delay and area efficiency at this stage, we apply logic optimization to the partial product generation circuitry, designing the Booth encoders so that redundant logic in the multiplexer circuitry is removed. Moreover, we introduce specialized sign compressors tailored for carry-save compression in the reduction stage. Compared with conventional counterparts, these compressors offer lower power consumption and a reduced critical path delay of only two XOR gates.
Finally, in the summation stage, we propose an optimized Carry Look-Ahead Adder segment designed to deliver fast throughput with minimal fan-in logic gates, even at high bit widths. This segment is cascaded to form the 13-bit final adder of the proposed design. The proposed architecture is synthesized for an ASIC target in Cadence Genus using the FreePDK 45 nm CMOS process technology. Synthesis results, along with a theoretical design complexity comparison, demonstrate that the proposed design outperforms state-of-the-art 8 × 8 multiplier designs on critical metrics, including delay, power consumption, area utilization, power delay product, and area delay product.
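
The three stages named in the summary can be illustrated with a small behavioral model. The following Python sketch is not the authors' circuit: it assumes radix-4 Booth recoding (the record does not state the radix), models negative partial products as a bitwise inversion plus a deferred sign bit (in the spirit of the inverter and sign selector arrangement), reduces the rows with generic 3:2 carry-save compressors, and uses ordinary integer addition in place of the proposed Carry Look-Ahead Adder segment. All function names are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def booth_radix4_digits(multiplier, bits=8):
    """Recode a signed multiplier into radix-4 Booth digits in {-2,...,2}."""
    y = (multiplier & ((1 << bits) - 1)) << 1  # append the implicit y[-1] = 0
    digits = []
    for i in range(0, bits, 2):
        triplet = (y >> i) & 0b111             # y[i+1], y[i], y[i-1]
        low, mid, high = triplet & 1, (triplet >> 1) & 1, (triplet >> 2) & 1
        digits.append(low + mid - 2 * high)
    return digits


def partial_product(multiplicand, digit, bits=8):
    """Return (row, sign_bit): negative rows are inverted, +1 is deferred."""
    mask = (1 << (2 * bits)) - 1
    x = to_signed(multiplicand, bits) & mask   # sign-extend to product width
    magnitude = (x << (abs(digit) - 1)) & mask if digit else 0
    if digit < 0:
        return (~magnitude) & mask, 1          # inverters + sign bit
    return magnitude, 0


def compress_3_2(a, b, c, mask):
    """Carry-save 3:2 compressor: the sum path is two XOR levels deep."""
    total = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return total & mask, carry & mask


def booth_multiply(x, y, bits=8):
    """Signed bits x bits multiply via Booth rows and carry-save reduction."""
    width = 2 * bits
    mask = (1 << width) - 1
    rows = []
    for k, digit in enumerate(booth_radix4_digits(y, bits)):
        row, sign_bit = partial_product(x, digit, bits)
        rows.append((row << (2 * k)) & mask)   # weight 4**k
        rows.append(sign_bit << (2 * k))       # deferred +1 for negation
    while len(rows) > 2:                       # reduce to two rows
        a, b, c = rows.pop(), rows.pop(), rows.pop()
        rows.extend(compress_3_2(a, b, c, mask))
    # Final summation: plain addition stands in for the paper's adder.
    return to_signed(sum(rows), width)
```

Calling `booth_multiply(-7, 13)` exercises all three stages; the result can be checked against Python's built-in `*` operator for any pair of 8-bit signed operands.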
ISSN:	2772-6711
DOI:	10.1016/j.prime.2024.100698