IGTREE - USING TREES FOR COMPRESSION AND CLASSIFICATION IN LAZY LEARNING ALGORITHMS

Citation
W. Daelemans et al., IGTREE - USING TREES FOR COMPRESSION AND CLASSIFICATION IN LAZY LEARNING ALGORITHMS, Artificial intelligence review, 11(1-5), 1997, pp. 407-423
Citations number
24
Subject Categories
Computer Sciences, Special Topics; Computer Science, Artificial Intelligence
ISSN journal
0269-2821
Volume
11
Issue
1-5
Year of publication
1997
Pages
407 - 423
Database
ISI
SICI code
0269-2821(1997)11:1-5<407:I-UTFC>2.0.ZU;2-5
Abstract
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter-phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
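The abstract's central heuristic, ranking symbolic features by information gain before building the compressed tree, can be sketched as follows. This is a minimal illustration of the information-gain computation on a toy instance base, not the paper's implementation; the function names and the toy data are assumptions for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(instances, labels, feature_index):
    """Reduction in class entropy from partitioning the instance base
    on one symbolic feature's values."""
    base = entropy(labels)
    n = len(labels)
    partitions = {}
    for inst, lab in zip(instances, labels):
        partitions.setdefault(inst[feature_index], []).append(lab)
    remainder = sum(len(part) / n * entropy(part)
                    for part in partitions.values())
    return base - remainder

# Toy instance base: three symbolic features, binary class.
X = [("a", "x", "p"), ("a", "y", "q"), ("b", "x", "p"), ("b", "y", "q")]
y = ["0", "0", "1", "1"]

# Order features by descending information gain; an IGTree-style
# compression would then test the most informative feature first.
order = sorted(range(3), key=lambda i: information_gain(X, y, i),
               reverse=True)
print(order)  # feature 0 perfectly predicts the class here → [0, 1, 2]
```

Features with high gain near the root let whole subtrees collapse into default classifications, which is where the storage and classification-time savings reported in the abstract come from.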