A highly regular parallel multiplier architecture, along with novel
low-power, high-performance CMOS implementation circuits, is presented. Its
advantages are achieved by combining a unique scheme for recursive
decomposition of partial product matrices with a recently proposed
non-binary arithmetic logic and complementary shift switch logic circuits.
The proposed 64 x 64-b parallel multiplier has the following distinct
features: (1) it generates 64 8 x 8-b partial product matrices instead of a
single large one; (2) it comprises only four stages of bit reduction: first
by small 8 x 8-b parallel multipliers, then by small parallel counters in
each of the remaining three stages, for which a family of shift switch
parallel counters, including non-binary (6,3)* and complementary (k,2) for
2 ≤ k ≤ 8, is proposed; (3) it uses a simple final adder.
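
The decomposition in feature (1) can be checked arithmetically. The sketch below is a software illustration only, not the proposed circuit: it splits each 64-bit operand into eight 8-bit blocks, forms the 64 block products, and sums them with their positional shifts. The function names (split_blocks, sub_products, multiply_by_blocks) are hypothetical, and the plain integer sum stands in for the counter stages and final adder.

```python
BLOCK = 8                   # operand width of each small multiplier, in bits
WIDTH = 64                  # operand width of the full multiplier, in bits
N_BLOCKS = WIDTH // BLOCK   # 8 blocks per operand, so 8 * 8 = 64 sub-products


def split_blocks(x):
    """Split a WIDTH-bit integer into N_BLOCKS 8-bit blocks, least significant first."""
    mask = (1 << BLOCK) - 1
    return [(x >> (BLOCK * i)) & mask for i in range(N_BLOCKS)]


def sub_products(a, b):
    """Return a (shift, product) pair for every 8 x 8-bit block pair."""
    a_blocks = split_blocks(a)
    b_blocks = split_blocks(b)
    return [
        (BLOCK * (i + j), a_blocks[i] * b_blocks[j])
        for i in range(N_BLOCKS)
        for j in range(N_BLOCKS)
    ]


def multiply_by_blocks(a, b):
    """Sum the shifted sub-products (stand-in for the reduction stages and adder)."""
    return sum(product << shift for shift, product in sub_products(a, b))
```

Each block product fits in 16 bits, and the product of blocks i and j carries weight 2^(8(i+j)), which is why the 64 small results can be reduced independently of how they were generated.
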
The non-binary logic operates on 4-bit state signals (representing integers
ranging from 0 to 3), where no more than half of the signal bits are subject
to a value change at any logic stage. This, together with other features
including minimum transistor counts, fewer inverters, and a low-leakage
logic structure, significantly reduces circuit power dissipation.
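
The bound on signal activity can be illustrated under one assumption that the text above does not state explicitly: that each 4-bit state signal is one-hot, with exactly one wire asserted per value. Under that assumed encoding, any change of value flips exactly two of the four bits. The helper names below are hypothetical.

```python
def one_hot(value):
    """Encode an integer 0..3 as a 4-bit one-hot word (assumed encoding)."""
    if not 0 <= value <= 3:
        raise ValueError("state value must be in the range 0..3")
    return 1 << value


def bits_changed(old, new):
    """Count the wires that toggle when the state moves from old to new."""
    return bin(one_hot(old) ^ one_hot(new)).count("1")


# Worst case over all 16 transitions between state values.
max_changed = max(bits_changed(a, b) for a in range(4) for b in range(4))
```

Here max_changed evaluates to 2, that is, half of the four signal bits, which is consistent with the stated limit. A plain 2-bit binary encoding would also toggle at most two wires, but then every wire of the signal could change at once.
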