The power of vacillation in language learning

Authors
J. Case
Citation
J. Case, The power of vacillation in language learning, SIAM J COMP, 28(6), 1999, pp. 1941-1969
Citations number
99
Subject Categories
Computer Science & Engineering
Journal title
SIAM JOURNAL ON COMPUTING
ISSN journal
0097-5397
Volume
28
Issue
6
Year of publication
1999
Pages
1941 - 1969
Database
ISI
SICI code
0097-5397(19990817)28:6<1941:TPOVIL>2.0.ZU;2-T
Abstract
Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are classes of languages that can be learned if convergence in the limit to up to (n + 1) exactly correct grammars is allowed, but which cannot be learned if convergence in the limit is to no more than n grammars, where the no more than n grammars can each make finitely many mistakes. This contrasts sharply with results of Barzdin and Podnieks and, later, Case and Smith for learnability from both positive and negative data.

A subset principle from a 1980 paper of Angluin is extended to the vacillatory and other criteria of this paper. This principle provides a necessary condition for avoiding overgeneralization in learning from positive data. It is applied to prove another theorem to the effect that one can optimally eliminate half of the mistakes from final programs for vacillatory criteria if one is willing to converge in the limit to infinitely many different programs instead.

Child language learning may be sensitive to the order or timing of data presentation. It is shown, though, that for the vacillatory success criteria of this paper, there is no loss of learning power for machines which are insensitive to order in several ways simultaneously. For example, partly set-driven machines attend only to the set and length of the sequence of positive data, not the actual sequence itself. A machine M is, by definition, weakly n-ary order independent iff, for each language L on which, for some ordering of the positive data about L, M converges in the limit to a finite set of grammars, there is a finite set of grammars D (of cardinality at most n) such that M converges to a subset of this same D for each ordering of the positive data for L. The theorem most difficult to prove in the paper implies that machines which are simultaneously partly set-driven and weakly n-ary order independent do not lose learning power for converging in the limit to up to n grammars. Several variants of this theorem are obtained by modifying its proof, and some of these variants have application in this and other papers. Along the way it is also shown, for the vacillatory criteria, that learning power is not increased if one restricts the sequence of positive data presentation to be computable. Some of these results are nontrivial lifts of prior work for the n = 1 case done by the Blums; Wiehagen; Osherson, Stob, and Weinstein; Schafer; and Fulk.
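For orientation, the criteria described informally in the abstract can be written out in the TxtFex notation standard in the inductive inference literature. The sketch below is an assumption based on that literature, not text quoted from this record; T ranges over texts (orderings of the positive data for L), W_p denotes the language generated by grammar p, and W_p =^a L means W_p and L differ on at most a elements.

% Rough formalization of the vacillatory criterion, assuming the
% standard TxtFex_b^a notation (not quoted from this record).
\[
  M \ \mathrm{TxtFex}^{a}_{b}\text{-identifies } L
  \iff
  \forall T \text{ for } L \ \exists D \ \bigl( |D| \le b
  \ \wedge\ M \text{ on } T \text{ converges in the limit to } D
  \ \wedge\ \forall p \in D : W_p =^{a} L \bigr).
\]
% The main theorem, as described in the abstract: up to n+1 exactly
% correct grammars beats up to n grammars even when each of the n
% may make finitely many (*) mistakes:
\[
  \mathrm{TxtFex}^{0}_{n+1} \setminus \mathrm{TxtFex}^{*}_{n} \neq \emptyset .
\]
% Weak n-ary order independence, restated from the abstract's definition:
\[
  \forall L\ \Bigl[ \bigl( \exists T \text{ for } L : M \text{ on } T
  \text{ converges to a finite set} \bigr)
  \Rightarrow \exists D\ \bigl( |D| \le n \ \wedge\
  \forall T' \text{ for } L : M \text{ on } T' \text{ converges to some }
  D' \subseteq D \bigr) \Bigr].
\]

Under this reading, the hardest theorem of the paper says that requiring M to be simultaneously partly set-driven and weakly n-ary order independent costs nothing with respect to TxtFex_n-style learning power.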