Automatic inference of models for statistical code compression

Authors
Citation
C.W. Fraser, Automatic inference of models for statistical code compression, ACM SIGPLAN Notices, 34(5), 1999, pp. 242-246
Number of citations
18
Subject categories
Computer Science & Engineering
Journal title
ACM SIGPLAN NOTICES
Journal ISSN
1523-2867
Volume
34
Issue
5
Year of publication
1999
Pages
242 - 246
Database
ISI
SICI code
1523-2867(199905)34:5<242:AIOMFS>2.0.ZU;2-U
Abstract
This paper describes experiments that apply machine learning to compress computer programs, formalizing and automating decisions about instruction encoding that have traditionally been made by humans in a more ad hoc manner. A program accepts a large training set of program material in a conventional compiler intermediate representation (IR) and automatically infers a decision tree that separates IR code into streams that compress much better than the undifferentiated whole. Driving a conventional arithmetic compressor with this model yields code 30% smaller than the previous record for IR code compression, and 24% smaller than an ambitious optimizing compiler feeding an ambitious general-purpose data compressor.
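
The following is a minimal sketch of the idea that separating code into streams can lower its coding cost. It is not the paper's method: instead of inferring a decision tree, it assumes a single hand-picked predictor (the previous operator), and the operator names and the helper functions `entropy_bits` and `split_cost` are hypothetical. It compares the empirical order-0 cost of one undifferentiated stream with the summed cost of the per-context streams, which is roughly what an ideal arithmetic coder would spend under each model.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Total empirical order-0 cost, in bits, of coding `symbols` as one stream."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())

def split_cost(pairs):
    """Total cost when each symbol is coded in the stream selected by its context."""
    streams = {}
    for context, symbol in pairs:
        streams.setdefault(context, []).append(symbol)
    return sum(entropy_bits(stream) for stream in streams.values())

# Hypothetical toy IR operator sequence (names are illustrative only).
ops = ["ADDR", "LOAD", "ADDR", "LOAD", "CONST", "ADD",
       "ADDR", "LOAD", "CONST", "ADD"]

# Assumed predictor: the previous operator selects the stream.
pairs = list(zip(["START"] + ops[:-1], ops))

whole = entropy_bits(ops)   # one undifferentiated stream
split = split_cost(pairs)   # one stream per context

print(f"whole: {whole:.2f} bits, split by previous operator: {split:.2f} bits")
```

On this toy input the split streams cost far fewer bits than the whole, because most contexts are followed by a single operator. The sketch ignores the cost of transmitting the model itself, and real savings depend on choosing good predictors from training data, which is the decision the paper automates.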