Performance results are presented for the design and implementation of
parallel pipelined space-time adaptive processing (STAP) algorithms on parallel
computers. In particular, the issues involved in parallelization, our
approach to parallelization, and performance results on an Intel Paragon are
described. The process of developing software for such an application on
parallel computers when latency and throughput are considered together is
discussed, and the tradeoffs considered with respect to inter- and intratask
communication and data redistribution are presented. The results show that not
only was scalable performance achieved for the individual component tasks of
STAP, but linear speedups were also obtained for the integrated task
performance, in both latency and throughput. Results are presented for up to 236
compute nodes (limited by the machine size available to us). Another
interesting observation from the implementation results is that assigning
additional processors to one task can improve the performance of other tasks
without any increase in the number of processors assigned to them. Normally,
this cannot be predicted by theoretical analysis.