IGTREE - USING TREES FOR COMPRESSION AND CLASSIFICATION IN LAZY LEARNING ALGORITHMS

Citation
W. Daelemans et al., IGTREE - USING TREES FOR COMPRESSION AND CLASSIFICATION IN LAZY LEARNING ALGORITHMS, Artificial intelligence review, 11(1-5), 1997, pp. 407-423
Citations number
24
Subject Categories
Computer Sciences, Special Topics; Computer Science, Artificial Intelligence
ISSN journal
0269-2821
Volume
11
Issue
1-5
Year of publication
1997
Pages
407 - 423
Database
ISI
SICI code
0269-2821(1997)11:1-5<407:I-UTFC>2.0.ZU;2-5
Abstract
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter-phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
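The abstract's central heuristic, ranking symbolic features by information gain before building the compressed tree, can be sketched as follows. This is a minimal illustration of the information-gain computation on a toy instance base, not the paper's implementation; the function names and the toy data are assumptions for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(instances, labels, feature_index):
    """Reduction in class entropy from partitioning the instance base
    on one symbolic feature's values."""
    base = entropy(labels)
    n = len(labels)
    partitions = {}
    for inst, lab in zip(instances, labels):
        partitions.setdefault(inst[feature_index], []).append(lab)
    remainder = sum(len(part) / n * entropy(part)
                    for part in partitions.values())
    return base - remainder

# Toy instance base: three symbolic features, binary class.
X = [("a", "x", "p"), ("a", "y", "q"), ("b", "x", "p"), ("b", "y", "q")]
y = ["0", "0", "1", "1"]

# Order features by descending information gain; an IGTree-style
# compression would then test the most informative feature first.
order = sorted(range(3), key=lambda i: information_gain(X, y, i),
               reverse=True)
print(order)  # feature 0 perfectly predicts the class here → [0, 1, 2]
```

Features with high gain near the root let whole subtrees collapse into default classifications, which is where the storage and classification-time savings reported in the abstract come from.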