TEXTUAL IMAGE COMPRESSION - 2-STAGE LOSSY LOSSLESS ENCODING OF TEXTUAL IMAGES

Citation
I.H. Witten et al., TEXTUAL IMAGE COMPRESSION - 2-STAGE LOSSY LOSSLESS ENCODING OF TEXTUAL IMAGES, Proceedings of the IEEE, 82(6), 1994, pp. 878-888
Number of citations
17
Subject Categories
Engineering, Electrical & Electronic
Journal title
Proceedings of the IEEE
Journal ISSN
0018-9219
Volume
82
Issue
6
Year of publication
1994
Pages
878 - 888
Database
ISI
SICI code
0018-9219(1994)82:6<878:TIC-2L>2.0.ZU;2-K
Abstract
A two-stage method for compressing bilevel images is described that is particularly effective for images containing repeated subimages, notably text. In the first stage, connected groups of pixels, corresponding approximately to individual characters, are extracted from the image. These are matched against an adaptively constructed library of patterns seen so far, and the resulting sequence of symbol identification numbers is coded and transmitted. From this information, along with the library itself and the offsets from one mark to the next, an approximate image can be reconstructed. The result is a lossy method of compression that outperforms other schemes. The second stage employs the reconstructed image as an aid for encoding the original image using a statistical context-based compression technique. This yields a total bandwidth for exact transmission appreciably undercutting that required by other lossless binary image compression methods. Taken together, the lossy and lossless methods provide an effective two-stage progressive transmission capability for textual images which has application for legal, medical, and historical purposes, and to archiving in general.
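The first (lossy) stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (extract_marks, encode_lossy, decode_lossy), the max_mismatch parameter, and the matching rule (same bounding box, at most max_mismatch differing pixels) are assumptions made for illustration only. Entropy coding of the symbol stream and the second, context-based lossless stage are omitted.

```python
from collections import deque


def extract_marks(img):
    """Return 8-connected groups of black pixels as (top, left, bitmap),
    in raster order of each group's first pixel."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    marks = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected group.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            top = min(p[0] for p in pixels)
            left = min(p[1] for p in pixels)
            bottom = max(p[0] for p in pixels)
            right = max(p[1] for p in pixels)
            bitmap = [[0] * (right - left + 1) for _ in range(bottom - top + 1)]
            for py, px in pixels:
                bitmap[py - top][px - left] = 1
            marks.append((top, left, bitmap))
    return marks


def mismatch(a, b):
    """Count differing pixels; None if the bounding boxes differ in size."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return None
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode_lossy(img, max_mismatch=1):
    """Build the pattern library adaptively and emit (symbol id, dy, dx),
    where (dy, dx) is the offset from the previous mark's top-left corner."""
    library, symbols = [], []
    prev_top, prev_left = 0, 0
    for top, left, bitmap in extract_marks(img):
        for idx, pattern in enumerate(library):
            d = mismatch(bitmap, pattern)
            if d is not None and d <= max_mismatch:
                break  # close enough: reuse the existing library pattern
        else:
            idx = len(library)  # no match: add a new pattern
            library.append(bitmap)
        symbols.append((idx, top - prev_top, left - prev_left))
        prev_top, prev_left = top, left
    return library, symbols


def decode_lossy(library, symbols, h, w):
    """Reconstruct the approximate image from the library and symbol stream."""
    img = [[0] * w for _ in range(h)]
    y = x = 0
    for idx, dy, dx in symbols:
        y, x = y + dy, x + dx
        for r, row in enumerate(library[idx]):
            for c, p in enumerate(row):
                if p:
                    img[y + r][x + c] = 1
    return img
```

With max_mismatch=0 every distinct mark gets its own library entry and decode_lossy reproduces the input exactly. With a larger tolerance, near-identical marks share one pattern and the reconstruction is approximate. The pixels that still differ from the original are what the second stage would have to code, using the reconstruction as context.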