# HIERARCHICAL Integer Multiplier Unit

### Details

Category: Arithmetic Core

Created: July 10, 2003

Updated: January 27, 2020

### Other project properties

Development Status: Stable

WishBone compliant: No

WishBone version: n/a

## Description

Multiplication is a very important operation in microelectronics. Every modern microprocessor has it in its instruction set, and advanced microprocessors have dedicated multiplication units that perform a multiplication in one clock cycle. Multiplication is especially valuable in DSP processors, where it is practically the main operation. The performance of any DSP processor is determined by the delay of its MAC (multiply and accumulate) unit, so the efficiency of multiplication is very important.

Methodology Overview.
The idea of the algorithms is as follows. The product of unsigned multiplicands $A$ and $D$ may be represented in the following form: $A \cdot D = (B \cdot 2^n + C) \cdot (E \cdot 2^n + F)$, where $n$ is any number that satisfies the following conditions:

1. $2^n < A$;
2. $2^n < D$;
3. $C < 2^n$;
4. $F < 2^n$.
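Expanding the basic formula shows which partial products have to be computed. The expansion below is ordinary algebra applied to that formula, not a statement taken from the core's documentation; $C$ denotes the low part of $A$.

```latex
\begin{aligned}
A \cdot D &= (B \cdot 2^n + C)(E \cdot 2^n + F) \\
          &= B E \cdot 2^{2n} + (B F + C E) \cdot 2^n + C F
\end{aligned}
```

As an illustrative check with 8-bit operands and $n = 4$: $A = 183$ splits into $B = 11$, $C = 7$, and $D = 92$ splits into $E = 5$, $F = 12$. Then $55 \cdot 2^8 + (132 + 35) \cdot 2^4 + 84 = 14080 + 2672 + 84 = 16836 = 183 \cdot 92$.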

«Hierarchical» algorithm.
As follows from the theory of algorithms, the best timing efficiency should be expected when the operands B, C, E and F (see the basic formula) have equal widths at every algorithm call, i.e. n = m/2, where m is the operand width in bits. In this case the number of recursion levels is minimal, and the number of sums that make up the final result is also minimal.
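The recursive halving described above can be sketched in software. This is a minimal behavioural model, not the VHDL of the core: the function name `hier_mul`, the 1-bit base case, and the restriction to power-of-two widths are assumptions made for the illustration.

```python
def hier_mul(a: int, d: int, m: int) -> int:
    """Multiply two unsigned m-bit integers by recursive halving (n = m/2).

    Assumption for this sketch: m is a power of two, so every split is even.
    """
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    if not (0 <= a < (1 << m) and 0 <= d < (1 << m)):
        raise ValueError("operands must fit in m bits")
    if m == 1:
        return a & d  # a 1-bit product is a single AND
    n = m // 2
    mask = (1 << n) - 1
    b, c = a >> n, a & mask  # A = B * 2^n + C
    e, f = d >> n, d & mask  # D = E * 2^n + F
    # Four half-width products, combined by shifts and additions.
    be = hier_mul(b, e, n)
    bf = hier_mul(b, f, n)
    ce = hier_mul(c, e, n)
    cf = hier_mul(c, f, n)
    return (be << (2 * n)) + ((bf + ce) << n) + cf


print(hier_mul(183, 92, 8))
```

With an even split, an m-bit multiplication needs log2(m) levels of recursion, and every level uses four half-width products whose results are merged by additions only.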

Modified «hierarchical» algorithm.
This algorithm is an attempt to improve the “hierarchical” algorithm for wide operands by replacing one multiplication with several addition operations. For the commonly used widths (8 to 64 bits), however, the result was not as expected. The advantages of the algorithm are expected to appear for m in [128..∞), where it may be preferable to the prototype.
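The text does not name the substitution it uses. One well-known way to trade a multiplication for extra additions is the Karatsuba identity, BF + CE = (B + C)(E + F) - BE - CF. The sketch below assumes, without confirmation from the source, that the modified algorithm is of this kind; the name `karatsuba_mul` and the 4-bit cutoff are hypothetical.

```python
def karatsuba_mul(a: int, d: int, m: int) -> int:
    """Multiply unsigned integers of at most m bits using three sub-products.

    Assumption: this is a Karatsuba-style stand-in, not the core's actual method.
    """
    if m <= 4:  # hypothetical cutoff: small products are computed directly
        return a * d
    n = m // 2
    mask = (1 << n) - 1
    b, c = a >> n, a & mask  # A = B * 2^n + C, B has at most m - n bits
    e, f = d >> n, d & mask  # D = E * 2^n + F
    be = karatsuba_mul(b, e, m - n)
    cf = karatsuba_mul(c, f, n)
    # (B + C)(E + F) = BE + BF + CE + CF, so one product and two
    # subtractions replace the two products BF and CE.
    mid = karatsuba_mul(b + c, e + f, (m - n) + 1) - be - cf
    return (be << (2 * n)) + (mid << n) + cf


print(karatsuba_mul(183, 92, 8))
```

The sums B + C and E + F are one bit wider than the halves, and each level adds extra additions and subtractions. That overhead is one plausible reason why such a scheme only pays off for wide operands.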

## Features

"Hierarchical" integer multiplication unit characteristics
The algorithm was written in VHDL and synthesized with Synopsys Design Compiler on a 0.35 µm CMOS library. The area figures are given only for the combinational part of the algorithm.

| Operand width (bits) | Delay (ns) | Gates allocated |
|---|---|---|
| 8 | 9.56 | 760 |
| 16 | 15.15 | 2505 |
| 32 | 23.12 | 9355 |
| 64 | 35.43 | 33805 |

"Optimized Hierarchical" multiplication IP core characteristics
The algorithm was written in VHDL and synthesized with Synopsys Design Compiler on a 0.35 µm CMOS library. The area figures are given only for the combinational part of the algorithm.

| Operand width (bits) | Delay (ns) | Gates allocated |
|---|---|---|
| 8 | 14.28 | 1015 |
| 16 | 21.76 | 3585 |
| 32 | 33.85 | 11240 |
| 64 | 56.48 | 30368 |
