In this paper, we present a novel scheme for performing fixed-point ar
ithmetic efficiently on fine-grain, massively parallel, programmable a
rchitectures including both custom and FPGA-based systems. We achieve
an O(n) speedup, where n is the operand precision, over the bit-serial
methods of existing fine-grain systems such as the DAP, the MPP and t
he CM2, within the constraints of regular, near neighbor communication
and only a small amount of on-chip memory. This is possible by means
of digit pipelined algorithms which avoid broadcast and which operate
in a fully systolic manner by pipelining at the digit level. A base 4,
signed-digit, fully redundant number system and on-line techniques ar
e used to limit carry propagation and minimize communication costs. Al
though our algorithms are digit-serial, we are able to match the perfo
rmance of the bit-parallel methods, while retaining low communication
complexity. Reconfigurable hardware systems built using field programm
able gate arrays (FPGA's) can share in the speed benefits of these alg
orithms. By using the organization of logic blocks suggested in this p
aper, problems of placement and routing that exist in such systems can
be avoided. Since the algorithms are amenable to pipelining, very hig
h throughput can be obtained.