Forgetting exceptions is harmful in language learning

Citation
W. Daelemans et al., Forgetting exceptions is harmful in language learning, MACH LEARN, 34(1-3), 1999, pp. 11-41
Citation count
68
Subject categories
AI Robotics and Automatic Control
Journal title
MACHINE LEARNING
Journal ISSN
08856125
Volume
34
Issue
1-3
Year of publication
1999
Pages
11 - 41
Database
ISI
SICI code
0885-6125(199902)34:1-3<11:FEIHIL>2.0.ZU;2-9
Abstract
We show that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy. We investigate this phenomenon empirically on a selection of benchmark natural language processing tasks: grapheme-to-phoneme conversion, part-of-speech tagging, prepositional-phrase attachment, and base noun phrase chunking. In a first series of experiments we combine memory-based learning with training set editing techniques, in which instances are edited based on their typicality and class prediction strength. Results show that editing exceptional instances (with low typicality or low class prediction strength) tends to harm generalization accuracy. In a second series of experiments we compare memory-based learning and decision-tree learning methods on the same selection of tasks, and find that decision-tree learning often performs worse than memory-based learning. Moreover, the decrease in performance can be linked to the degree of abstraction from exceptions (i.e., pruning or eagerness). We provide explanations for both results in terms of the properties of the natural language processing tasks and the learning algorithms.