This paper introduces a new data compression algorithm. The goal underlying this new code design is to achieve a single lossless compression algorithm with the excellent compression ratios of the Prediction by Partial Mapping (PPM) algorithms and the low complexity of codes based on the Burrows Wheeler Transform (BWT). Like the BWT-based codes, the proposed algorithm requires worst-case O(n) computational complexity and memory; in contrast, the unbounded-context PPM algorithm, called PPM*, requires worst-case O(n^2) computational complexity. Like PPM*, the proposed algorithm allows the use of unbounded contexts. Using standard data sets for comparison, the proposed algorithm yields an average rate of 2.29 bits per character (bpc) on the Calgary corpus; this result compares favorably with the 2.33 and 2.34 bpc of PPM5 and PPM* (PPM algorithms), the 2.43 bpc of BW94 (the original BWT-based code), and the 3.64 and 2.69 bpc of compress and gzip (popular Unix compression algorithms based on Lempel-Ziv (LZ) coding techniques) on the same data set. The given code does not, however, match the best reported compression performance, 2.12 bpc with PPMZ9, listed on the Calgary corpus results web page at the time of this publication. Results on the Canterbury corpus give a similar relative standing. The proposed algorithm gives an average rate of 2.15 bpc on the Canterbury corpus, while the Canterbury corpus web page gives average rates of 1.99 bpc for PPMZ9, 2.11 bpc for PPM5, 2.15 bpc for PPM7, 2.23 bpc for BZIP2 (a popular BWT-based code), and 3.31 and 2.53 bpc for compress and gzip, respectively.
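For readers unfamiliar with the Burrows Wheeler Transform underlying the BWT-based codes discussed above, the following is a minimal illustrative sketch, not the paper's algorithm: it builds the transform naively by sorting all rotations of the input, whereas practical BWT compressors use suffix-sorting techniques to obtain the low-complexity behavior the abstract refers to. The function names `bwt` and `inverse_bwt` are chosen here for illustration only.

```python
def bwt(s: str) -> tuple[str, int]:
    # Naive forward Burrows-Wheeler Transform: sort every rotation
    # of s and read off the last column. O(n^2 log n) as written;
    # real BWT codes replace the rotation sort with suffix sorting.
    n = len(s)
    rotations = sorted(s[i:] + s[:i] for i in range(n))
    last_column = "".join(rot[-1] for rot in rotations)
    index = rotations.index(s)  # row that holds the original string
    return last_column, index

def inverse_bwt(last: str, index: int) -> str:
    # Classic textbook inversion: repeatedly prepend the last column
    # to the table and re-sort; after n rounds each row is a full
    # sorted rotation, and row `index` is the original string.
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(c + row for c, row in zip(last, table))
    return table[index]

# Example: the transform groups identical characters together,
# which is what makes the output easy to compress.
last, idx = bwt("banana")   # -> ("nnbaaa", 3)
original = inverse_bwt(last, idx)
```

The grouping of repeated symbols in the last column ("nnbaaa" above) is the property that BWT-based compressors such as BW94 and BZIP2 exploit with simple, fast second-stage coders.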