We demonstrate an approach to parallel programming, based on skeletons
- parameterized program schemas with efficient implementations over d
iverse architectures. The contribution of the paper is two-fold: (1)we
classify divide-and-conquer (DC) algorithms and provide a family of p
rovably correct parallel implementations for a particular DC skeleton,
called DH (distributable homomorphism); (2) we adjust the mathematica
l specification of the Fast Fourier Transform (FFT) to the DH skeleton
and, thereby, obtain a generic SPMD program, well suited for implemen
tation under MPI. The generic program includes the efficient FFT solut
ions used in practice - the binary-exchange and the 2D- and 3D-transpo
se implementations - as special cases.