This paper presents a hardware-efficient design for the discrete Fourier
transform (DFT). The proposed design not only applies the constant property,
but also exploits the numerical property of the transform coefficients. The
DFT is first formulated as a cyclic convolution so that every DFT output
sample is computed with the same computation kernel. Then, by exploiting the
symmetries of the DFT coefficients, word-level hardware sharing can be
applied, which doubles the throughput. Finally, bit-level common
subexpression sharing is applied to implement the complex constant
multiplications using only shift operations and additions. Although the
three techniques have been proposed separately for transforms, this
paper integrates the above techniques and obtains additive improvements.
The I/O channels in our design are limited to the two extreme ends of the
architecture, which results in low I/O bandwidth. Compared with the previous
memory-based design, the presented approach saves 80% of the gate area and
achieves twice the throughput for length N = 61. The presented approach can
also be applied to power-of-two-length DFTs. Similar efficient designs can
be obtained for other transforms, such as the DCT, by applying the proposed
approach.
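
As an illustration of the first two steps, the following minimal Python sketch rewrites a prime-length DFT as a cyclic convolution and then checks one symmetry of the resulting kernel. It assumes the standard primitive-root re-indexing (Rader's formulation), which may differ from the exact formulation used in the paper; the function names `primitive_root`, `dft_direct`, and `dft_cyclic` are hypothetical and chosen only for this example. It is a software model of the arithmetic, not of the hardware architecture.

```python
import cmath


def primitive_root(p):
    """Smallest primitive root of an odd prime p (brute force, small p only)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")


def dft_direct(x):
    """Reference O(N^2) DFT computed straight from the definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dft_cyclic(x):
    """DFT of odd prime length N as a length-(N-1) cyclic convolution.

    Assumption: Rader-style re-indexing by a primitive root g of N.
    Returns the DFT outputs and the fixed convolution kernel.
    """
    n = len(x)
    m = n - 1
    g = primitive_root(n)
    g_inv = pow(g, n - 2, n)  # modular inverse of g (Fermat's little theorem)

    # Permuted input a[q] = x[g^q]; constant kernel b[q] = W^(g^-q).
    a = [x[pow(g, q, n)] for q in range(m)]
    b = [cmath.exp(-2j * cmath.pi * pow(g_inv, q, n) / n) for q in range(m)]

    out = [0j] * n
    out[0] = sum(x)  # the zero-frequency output is a plain sum
    for p in range(m):
        # X[g^-p] = x[0] + sum_q a[q] * b[(p - q) mod m]
        # Every output uses the same kernel b, only cyclically shifted.
        acc = sum(a[q] * b[(p - q) % m] for q in range(m))
        out[pow(g_inv, p, n)] = x[0] + acc
    return out, b
```

Because g^((N-1)/2) = -1 (mod N), the second half of the kernel is the complex conjugate of the first half, so only (N-1)/2 distinct complex constants appear. This is one symmetry of the DFT coefficients that a word-level sharing scheme could exploit; the sharing scheme actually used in the paper is not described in the abstract.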