In this paper, two new system architectures, overlap-state sequential and
split-and-merge parallel, are proposed based on a novel boundary
postprocessing technique for the computation of the discrete wavelet
transform (DWT).
The basic idea is to introduce multilevel partial computations for samples
near data boundaries based on a finite state machine model of the DWT
derived from the lifting scheme. The key observation is that these
partially computed (lifted) results can also be stored back to their
original locations
and the transform can be continued at any later time as long as these
partially computed results are preserved. It is shown that such an
extension of the in-place calculation feature of the original lifting
algorithm greatly helps
to reduce the extra buffer and communication overheads in sequential and
parallel system implementations, respectively. Performance analysis and
experimental results show that, for the Daubechies (9,7) wavelet filters,
using
the proposed boundary postprocessing technique, the minimal required
buffer size in the line-based sequential DWT algorithm [1] is 40% smaller
than that of the
best available approach. In the parallel DWT algorithm, we show 30% faster
performance than existing approaches.
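
The idea of storing partially lifted boundary samples in place and finishing them later can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. For brevity it assumes the shorter 5/3 lifting weights instead of the Daubechies (9,7) filters, a single decomposition level, an even-length signal with symmetric extension at its ends, and a split into just two blocks. The function names (lift_whole, lift_first_block, finish_with_second_block) are hypothetical and chosen only for this illustration.

```python
A, B = -0.5, 0.25  # predict/update weights; 5/3 lifting assumed, not the (9,7) filters

def lift_whole(signal):
    """Reference: one level of in-place lifting over the whole even-length signal."""
    x = list(signal)
    n = len(x)
    for i in range(1, n, 2):                 # predict step on odd samples
        right = x[i + 1] if i + 1 < n else x[i - 1]   # symmetric extension
        x[i] += A * (x[i - 1] + right)
    for i in range(0, n, 2):                 # update step on even samples
        left = x[i - 1] if i > 0 else x[i + 1]        # symmetric extension
        x[i] += B * (left + x[i + 1])
    return x

def lift_first_block(x, m):
    """Lift x[0:m] in place without reading x[m:] (m even, m >= 4)."""
    for i in range(1, m - 1, 2):             # odd samples whose neighbours are both present
        x[i] += A * (x[i - 1] + x[i + 1])
    x[m - 1] += A * x[m - 2]                 # partial predict: right neighbour not yet available
    for i in range(0, m - 2, 2):             # even samples whose neighbours are both final
        left = x[i - 1] if i > 0 else x[i + 1]
        x[i] += B * (left + x[i + 1])
    x[m - 2] += B * x[m - 3]                 # partial update: x[m-1] is not final yet

def finish_with_second_block(x, m):
    """Complete the two pending boundary samples, then lift x[m:] in place."""
    n = len(x)
    x[m - 1] += A * x[m]                     # finish the pending predict
    x[m - 2] += B * x[m - 1]                 # finish the pending update
    for i in range(m + 1, n, 2):
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += A * (x[i - 1] + right)
    for i in range(m, n, 2):
        x[i] += B * (x[i - 1] + x[i + 1])

signal = [3, 7, 1, 8, 2, 9, 4, 6, 5, 0, 2, 7]
m = 6
buf = signal[:m] + [None] * (len(signal) - m)   # second block has not arrived yet
lift_first_block(buf, m)                        # boundary samples hold partial results
buf[m:] = signal[m:]                            # second block arrives
finish_with_second_block(buf, m)
assert all(abs(a - b) < 1e-12 for a, b in zip(buf, lift_whole(signal)))
```

In this sketch the only state carried between the two calls is the pair of partially lifted values already sitting in the buffer at positions m-2 and m-1, so no separate copy of overlapping input samples is kept.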