The carry-save adder (CSA) is one of the most widely used components for
fast arithmetic in industry. This paper solves the problem of finding an
optimal-timing allocation of CSAs in arithmetic circuits. Namely, we
present a polynomial-time algorithm that finds an optimal-timing CSA
allocation for a given arithmetic expression. We then extend this result
to the problem of optimizing arithmetic expressions across design
hierarchy boundaries by introducing a new concept called auxiliary ports.
Our algorithm carries out the CSA allocation step optimally and
automatically, and it can do so within a standard RTL synthesis
environment.
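
For readers unfamiliar with the building block, the following minimal sketch illustrates what a single CSA computes: it reduces three operands to a sum word and a carry word whose total equals the sum of the inputs, so that only one carry-propagating addition is needed at the end of a multi-operand sum. This is not the allocation algorithm of the paper; the function names `carry_save_add` and `sum_with_csa_chain` are hypothetical, the operands are assumed to be non-negative integers, and the order in which operands are combined here is arbitrary rather than timing-driven.

```python
def carry_save_add(x, y, z):
    """Reduce three non-negative integers to (sum_word, carry_word).

    Bitwise, each position acts as a full adder with no carry propagation:
    the sum bit is the XOR of the three inputs and the carry bit is their
    majority, shifted one position left. Invariant: x + y + z == s + c.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def sum_with_csa_chain(operands):
    """Add a list of operands using CSAs, then one final ordinary addition.

    Illustrative only: operands are consumed in list order, with no attempt
    to account for signal arrival times.
    """
    pending = list(operands)
    while len(pending) > 2:
        x, y, z = pending[:3]
        s, c = carry_save_add(x, y, z)
        pending = pending[3:] + [s, c]
    # The final step stands in for the single carry-propagate adder.
    return sum(pending)
```

In this sketch the choice of which three operands feed each CSA is fixed by list order; the allocation problem addressed in the paper concerns making such choices so that the resulting circuit timing is optimal.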