A new IEEE-compliant floating-point rounding algorithm for computing the
rounded product from a carry-save representation of the product is
presented. The new rounding algorithm is compared with the rounding
algorithms of Yu and Zyner [26] and of Quach et al. [17]. For each rounding
algorithm, a logical description and a block diagram are given, the
correctness is proven, and the latency is analyzed. We conclude that the
new rounding algorithm is the fastest rounding algorithm, provided that an
injection (which depends only on the rounding mode and the sign) can be
added during the reduction of the partial products into a carry-save
encoded digit string. In double-precision format, the latency of the new
rounding algorithm is 12 logic levels, compared to 14 logic levels in the
algorithm of Quach et al. and 16 logic levels in the algorithm of Yu and
Zyner.
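
The role of the injection can be illustrated with a small software sketch.
It is not the hardware algorithm summarized above: it works on an ordinary
integer magnitude rather than a carry-save pair, it ignores normalization
and exponent handling, and the specific injection values, the mode names,
and the function names `injection` and `round_magnitude` are assumptions
made for illustration. The idea shown is that adding a constant chosen from
the rounding mode and the sign alone lets every mode be finished by
truncation, with a single correction for ties in round-to-nearest-even.

```python
def injection(mode: str, negative: bool, k: int) -> int:
    """Assumed injection when k >= 1 low-order bits are discarded.

    Depends only on the rounding mode and the sign, never on the operand.
    Mode names are hypothetical labels for the four IEEE rounding directions.
    """
    if mode == "nearest_even":
        return 1 << (k - 1)        # half a unit in the last kept place
    # Directed modes round the magnitude away from zero only when the
    # direction agrees with the sign; otherwise they truncate.
    away = (mode == "toward_pos" and not negative) or \
           (mode == "toward_neg" and negative)
    if away:
        return (1 << k) - 1        # just under one unit in the last kept place
    return 0                       # toward_zero, or a directed mode that truncates


def round_magnitude(p: int, k: int, mode: str, negative: bool) -> int:
    """Round the magnitude p / 2**k to an integer via injection plus truncation."""
    total = p + injection(mode, negative, k)
    q = total >> k                 # truncation does the rest of the work
    # Tie case for nearest-even: the injected sum has all-zero discarded
    # bits, so the value was exactly halfway and was rounded up; clearing
    # the least significant bit forces the even neighbour.
    if mode == "nearest_even" and total & ((1 << k) - 1) == 0:
        q &= ~1
    return q
```

For example, with `k = 2` the magnitude `13` stands for 3.25: it rounds to 3
under `"nearest_even"` and `"toward_zero"`, and to 4 under `"toward_pos"`
when the sign is positive. The magnitude `10` stands for 2.5, a tie, and
rounds to the even value 2 under `"nearest_even"`.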