Universal lossless compression via multilevel pattern matching

Citation
Jc. Kieffer et al., Universal lossless compression via multilevel pattern matching, IEEE INFO T, 46(4), 2000, pp. 1227-1245
Citations number
17
Categorie Soggetti
Information Tecnology & Communication Systems
Journal title
IEEE TRANSACTIONS ON INFORMATION THEORY
ISSN journal
00189448 → ACNP
Volume
46
Issue
4
Year of publication
2000
Pages
1227 - 1245
Database
ISI
SICI code
0018-9448(200007)46:4<1227:ULCVMP>2.0.ZU;2-D
Abstract
A universal lossless data compression code called the multilevel pattern ma tching code (MPM code) is introduced. Ln processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequential ly. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping pos itions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encod ed token strings, the decoder can reconstruct the data string via several r ounds of parallel substitutions. A O(1/log n) maximal redundancy/sample upp er bound is established for the MPM code with respect to any class of finit e state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The r esults of some MPM code compression experiments are reported.