Language simplification through error-correcting and grammatical inference techniques

Citation
J.C. Amengual et al., Language simplification through error-correcting and grammatical inference techniques, MACH LEARN, 44(1-2), 2001, pp. 143-159
Number of citations
21
Subject Categories
AI Robotics and Automatic Control
Journal title
MACHINE LEARNING
Journal ISSN
0885-6125
Volume
44
Issue
1-2
Year of publication
2001
Pages
143 - 159
Database
ISI
SICI code
0885-6125(2001)44:1-2<143:LSTEAG>2.0.ZU;2-D
Abstract
In many language processing tasks, most of the sentences generally convey rather simple meanings. Moreover, these tasks have a limited semantic domain that can be properly covered with a simple lexicon and a restricted syntax. Nevertheless, casual users are by no means expected to comply with any kind of formal syntactic restrictions due to the inherent "spontaneous" nature of human language. In this work, the use of error-correcting-based learning techniques is proposed to cope with the complex syntactic variability which is generally exhibited by natural language. In our approach, a complex task is modeled in terms of a basic finite state model, F, and a stochastic error model, E. F should account for the basic (syntactic) structures underlying this task, which would convey the meaning. E should account for general vocabulary variations, word disappearance, superfluous words, and so on. Each "natural" user sentence is thus considered as a corrupted version (according to E) of some "simple" sentence of L(F). Adequate bootstrapping procedures are presented that incrementally improve the "structure" of F while estimating the probabilities for the operations of E. These techniques have been applied to a practical task of moderately high syntactic variability, and the results which show the potential of the proposed approach are presented.
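Illustration
The idea of treating a user sentence as a corrupted version of a sentence of L(F) can be sketched as a least-cost search over pairs (input position, state of F). The sketch below is not the paper's algorithm: it assumes a deterministic automaton given as a dictionary, and it replaces the stochastic error model E with three fixed, hypothetical costs (`sub`, `ins`, `dele`) that stand in for negative log-probabilities of substitution, superfluous-word, and missing-word operations. The function name, the toy automaton, and the example sentences are all invented for illustration.

```python
import heapq


def error_correcting_parse(words, transitions, start, finals,
                           sub=1.0, ins=1.0, dele=1.0):
    """Find the sentence of L(F) closest to `words` under fixed edit costs.

    transitions: dict mapping (state, word) -> next_state (hypothetical format).
    Returns (cost, simple_sentence), or (inf, None) if no final state is reachable.
    """
    n = len(words)
    # Group the arcs of F by source state for quick lookup.
    arcs = {}
    for (q, a), r in transitions.items():
        arcs.setdefault(q, []).append((a, r))

    # Dijkstra over nodes (i, q): i input words consumed, F is in state q.
    heap = [(0.0, 0, start, ())]
    done = set()
    while heap:
        cost, i, q, out = heapq.heappop(heap)
        if (i, q) in done:
            continue
        done.add((i, q))
        if i == n and q in finals:
            return cost, list(out)
        if i < n:
            # Superfluous word: skip the input word, stay in the same state.
            heapq.heappush(heap, (cost + ins, i + 1, q, out))
        for a, r in arcs.get(q, []):
            # Word disappearance: F emits `a`, but the input lacks it.
            heapq.heappush(heap, (cost + dele, i, r, out + (a,)))
            if i < n:
                # Exact match costs nothing; otherwise a vocabulary variation.
                step = 0.0 if a == words[i] else sub
                heapq.heappush(heap, (cost + step, i + 1, r, out + (a,)))
    return float("inf"), None


# Toy model F accepting the single "simple" sentence: show flights to boston
F_TRANSITIONS = {
    (0, "show"): 1,
    (1, "flights"): 2,
    (2, "to"): 3,
    (3, "boston"): 4,
}

noisy = "please show me flights to boston".split()
cost, simple = error_correcting_parse(noisy, F_TRANSITIONS, start=0, finals={4})
```

Here the two superfluous words "please" and "me" are absorbed by the error model, so `simple` is the underlying sentence of the toy automaton. In a stochastic setting the three constants would be replaced by estimated per-operation probabilities, which is the part the abstract says is learned by bootstrapping.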