S. Goedecker, ROTATING A 3-DIMENSIONAL ARRAY IN AN OPTIMAL POSITION FOR VECTOR PROCESSING - CASE-STUDY FOR A 3-DIMENSIONAL FAST FOURIER-TRANSFORM, Computer Physics Communications, 76(3), 1993, pp. 294-300
We show that a three-dimensional array of dimensions n1, n2, n3 can be
rotated in such a way that all the innermost loops have lengths that
are products of two dimensions, i.e. n1n2, n1n3, or n2n3. This
technique is then applied to rotate a parallelepiped of data into an
optimal position for Fourier transformations along the three axes. The
resulting three-dimensional FFT (fast Fourier transform) has only inner
loops of length n1n2, n1n3, or n2n3. This increased loop length results
in a significant reduction of the required CPU time on vector machines.
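
The rotation idea can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names (rotate_and_transform, fft3d_by_rotation) are hypothetical, and NumPy's vectorised operation over the fused axis merely stands in for the long inner loop of a vector machine. Each pass fuses two dimensions into one index of length n2*n3, transforms along the remaining axis, and then rotates the axes cyclically so that the next pass sees the next dimension in front.

```python
import numpy as np

def rotate_and_transform(a):
    """Transform along axis 0, then rotate axes: (n1, n2, n3) -> (n2, n3, n1)."""
    n1, n2, n3 = a.shape
    # Fuse the two trailing dimensions, so every elementary step of the
    # 1-D transform acts on a vector of length n2 * n3.
    fused = a.reshape(n1, n2 * n3)
    # 1-D FFT of length n1, applied to all n2 * n3 columns at once.
    transformed = np.fft.fft(fused, axis=0)
    # The transpose moves the transformed axis to the back; this is the rotation.
    return transformed.T.reshape(n2, n3, n1)

def fft3d_by_rotation(a):
    """Three transform-and-rotate passes restore the original axis order."""
    for _ in range(3):
        a = rotate_and_transform(a)
    return a

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((3, 4, 5)) + 1j * rng.standard_normal((3, 4, 5))
    # Compare against NumPy's built-in n-dimensional FFT.
    print(np.allclose(fft3d_by_rotation(data), np.fft.fftn(data)))
```

After the three passes the fused lengths encountered are n2*n3, n3*n1, and n1*n2, matching the loop lengths named in the abstract.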