Ch. Chang et al., Efficient VLSI architectures for fast computation of the discrete fourier transform and its inverse, IEEE SIGNAL, 48(11), 2000, pp. 3206-3216
In this paper, we propose two new VLSI architectures for computing the N-po
int discrete Fourier transform (DFT) and its inverse (IDFT) based on a radi
x-2 fast algorithm, where N is a power of two. The first part of this work
presents a linear systolic array that requires log(2) N complex multipliers
and is able to provide a throughput of one transform sample per clock cycl
e, Compared with other related systolic designs based on direct computation
or a radix-2 fast algorithm, the proposed one has the same throughput perf
ormance but involves less hardware complexity, This design is suitable for
high-speed real-time applications, but it would not be easily realized in a
single chip when N gets large. To balance the chip area and the processing
speed, we further present a new reduced-complexity design for the DFT/IDFT
computation. The alternative design is a memory-based architecture that co
nsists of one complex multiplier, two complex adders, and some special memo
ry units, The new design has the capability of computing one transform samp
le every log(2) N + 1 clock cycles on average. In comparison with the first
design, the second design reaches a lower throughput with less hardware co
mplexity. As N = 512, the chip area required for the memory-based design is
about 5742 x 5222 mum(2), and the corresponding throughput can attain a ra
te as high as 4M transform samples per second under 0.6 mum CMOS technology
, Such area-time performance makes this design very competitive for use in
long-length DFT applications, such as asymmetric digital subscriber lines (
ADSL) and orthogonal frequency-division multiplexing (OFDM) systems.