S. Goedecker, FAST RADIX-2, RADIX-3, RADIX-4, AND RADIX-5 KERNELS FOR FAST FOURIER TRANSFORMATIONS ON COMPUTERS WITH OVERLAPPING MULTIPLY-ADD INSTRUCTIONS, SIAM Journal on Scientific Computing, 18(6), 1997, pp. 1605-1611
We present a new formulation of fast Fourier transformation (FFT) kernels for radix 2, 3, 4, and 5, which have a perfect balance of multiplies and adds. These kernels give higher performance on machines that have a single multiply-add (mult-add) instruction. We demonstrate the superiority of this new kernel on IBM and SGI workstations.
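
To illustrate what a "perfect balance of multiplies and adds" can mean, the following is a minimal sketch of a radix-2 butterfly rearranged so that every floating-point operation has the form a*b + c. It is one commonly described rearrangement and is not taken from the paper, so it may differ from the author's kernels. The names `fma`, `butterfly_standard`, and `butterfly_fma` are hypothetical, and `fma` is a plain Python stand-in for a hardware multiply-add instruction. The sketch assumes the real part of the twiddle factor is nonzero, since it divides by it.

```python
import cmath


def fma(a, b, c):
    # Stand-in for a hardware multiply-add instruction: computes a * b + c.
    # (Assumption: plain Python arithmetic, so the result is rounded twice,
    # not fused.)
    return a * b + c


def butterfly_standard(a, b, w):
    # Conventional radix-2 butterfly: the complex product w * b costs
    # 4 multiplies and 2 adds, and the two outputs cost 4 more adds.
    t = w * b
    return a + t, a - t


def butterfly_fma(a, b, w):
    # Rearranged butterfly: write w = wr * (1 + i*t) with t = wi / wr,
    # so that w * b = wr * ((br - t*bi) + i*(bi + t*br)).
    # Requires wr != 0; t can be precomputed once per twiddle factor.
    wr, wi = w.real, w.imag
    t = wi / wr
    ur = fma(-t, b.imag, b.real)   # br - t*bi
    ui = fma(t, b.real, b.imag)    # bi + t*br
    # Each output component is one more multiply-add: 6 in total.
    out0 = complex(fma(wr, ur, a.real), fma(wr, ui, a.imag))
    out1 = complex(fma(-wr, ur, a.real), fma(-wr, ui, a.imag))
    return out0, out1


if __name__ == "__main__":
    a, b = 1 + 2j, 3 - 1j
    w = cmath.exp(-2j * cmath.pi / 8)  # example twiddle factor
    print(butterfly_standard(a, b, w))
    print(butterfly_fma(a, b, w))
```

In this sketch the conventional form uses 4 multiplies and 6 adds, which leaves the multiplier idle part of the time on a multiply-add machine, while the rearranged form uses 6 multiply-add operations and no stand-alone multiplies or adds.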