ITA
ENG

Efficient universal lossless data compression algorithms based on a greedysequential grammar transform - Part one: Without context models

Authors

Yang, EH Kieffer, JC

Citation

Eh. Yang et Jc. Kieffer, Efficient universal lossless data compression algorithms based on a greedysequential grammar transform - Part one: Without context models, IEEE INFO T, 46(3), 2000, pp. 755-777

Citations number

Categorie Soggetti

Information Tecnology & Communication Systems

Journal title

IEEE TRANSACTIONS ON INFORMATION THEORY

ISSN journal

00189448 → ACNP

Volume

Issue

Year of publication

2000

Pages

755 - 777

Database

ISI

SICI code

0018-9448(200005)46:3<755:EULDCA>2.0.ZU;2-E

Abstract

A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be f ully reconstructed. In a grammar-based code, a data sequence is first conve rted into a grammar by a grammar transform and then losslessly encoded. In this paper, a greedy grammar transform is first presented; this grammar tra nsform constructs sequentially a sequence of irreducible grammars from whic h the original data sequence can be recovered incrementally. Based on this grammar transform, three universal lossless data compression algorithms, a sequential algorithm, an improved sequential algorithm, and a hierarchical algorithm, are then developed. These algorithms combine the power of arithm etic coding with that of string matching. It is shown that these algorithms are all universal in the sense that they can achieve asymptotically the en tropy rate of any stationary, ergodic source. Moreover, it is proved that t heir worst case redundancies among all individual sequences of length n are upper-bounded by c log log n / log n, where c is a constant. Simulation re sults show that the proposed algorithms outperform the Unix Compress and Gz ip algorithms, which are based on LZ78 and LZ77, respectively.