SEMANTIC AND GENERATIVE MODELS FOR LOSSY TEXT COMPRESSION

Citation
Ih. Witten et al., SEMANTIC AND GENERATIVE MODELS FOR LOSSY TEXT COMPRESSION, Computer journal, 37(2), 1994, pp. 83-87
Number of citations
15
Subject Categories
Computer Sciences; Computer Science Hardware & Architecture
Journal title
Computer journal
ISSN journal
0010-4620
Volume
37
Issue
2
Year of publication
1994
Pages
83 - 87
Database
ISI
SICI code
0010-4620(1994)37:2<83:SAGMFL>2.0.ZU;2-E
Abstract
The complementary paradigms of text compression and image compression suggest that there may be potential for applying methods developed for one domain to the other. In image coding, lossy techniques yield compression factors that are vastly superior to those of the best lossless schemes, and we show that this is also the case for text. This paper investigates the resulting trade-off between subjective quality of the transmission and its compression factor. Two different methods are described, which can be combined into an extremely effective technique that provides far better compression than the present state of the art and yet preserves a reasonable degree of perceived match between the original and received text. The major challenge for lossy text compression is the quantitative evaluation of the quality of this match.