Distributed representations for extended syntactic transformation

Citation
L. Niklasson and F. Linaker, Distributed representations for extended syntactic transformation, CONNECT SCI, 12(3-4), 2000, pp. 299-314
Number of citations
38
Subject categories
AI Robotics and Automatic Control
Journal title
CONNECTION SCIENCE
ISSN journal
0954-0091
Volume
12
Issue
3-4
Year of publication
2000
Pages
299-314
Database
ISI
SICI code
0954-0091(200012)12:3-4<299:DRFEST>2.0.ZU;2-4
Abstract
This paper shows how the choice of representation substantially affects the generalization performance of connectionist networks. The starting point is Chalmers' simulations involving structure-sensitive processing. Chalmers argued that a connectionist network could handle structure-sensitive processing without the use of syntactically structured representations. He trained a connectionist architecture to encode/decode distributed representations for simple sentences. These distributed representations were then holistically transformed such that active sentences were transformed into their passive counterparts. However, he noted that the recursive auto-associative memory (RAAM), which was used to encode and decode distributed representations for the structures, exhibited only a limited ability to generalize when trained to encode/decode a randomly selected sample of the total corpus. When the RAAM was trained to encode/decode all sentences, and a separate transformation network was trained to make some active-passive transformations of the RAAM-encoded sentences, the transformation network demonstrated perfect generalization on the remaining test sentences. It is argued here that the main reason for the limited generalization lies not in the RAAM architecture per se, but in the choice of representation for the tokens used. This paper shows that 100% generalization can be achieved for Chalmers' original set-up (i.e. using only 30% of the total corpus for training). The key to this success is the use of distributed representations for the tokens (capturing different characteristics for different classes of tokens, e.g. verbs or nouns).
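
For readers unfamiliar with the set-up, the following minimal Python/NumPy sketch (not the authors' code) illustrates the two ideas the abstract turns on: it contrasts localist one-hot token codes with class-structured distributed token codes, and trains a single-level RAAM-style auto-associator that compresses a pair of token vectors into one parent vector and reconstructs the pair. The vocabulary, feature layout, and network sizes are illustrative assumptions, not those of the paper.

import numpy as np

rng = np.random.default_rng(0)

# --- Token representations ---
# Localist codes: one unit per token. 'john' and 'mary' share no units,
# so nothing in the code marks them both as nouns.
VOCAB = ["john", "mary", "chase", "love", "is", "by"]   # illustrative vocabulary
localist = {w: np.eye(len(VOCAB))[i] for i, w in enumerate(VOCAB)}

# Distributed codes: the first units encode word class, the rest identity
# within the class. Tokens of the same class share their class units,
# which is what lets a trained network generalize across class members.
CLASS = {"john": "noun", "mary": "noun", "chase": "verb",
         "love": "verb", "is": "func", "by": "func"}
CLASS_BITS = {"noun": [1, 0, 0], "verb": [0, 1, 0], "func": [0, 0, 1]}
IDENT_BITS = {"john": [1, 0], "mary": [0, 1], "chase": [1, 0],
              "love": [0, 1], "is": [1, 0], "by": [0, 1]}
distributed = {w: np.array(CLASS_BITS[CLASS[w]] + IDENT_BITS[w], float)
               for w in VOCAB}

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print("localist    cos(john, mary):", cos(localist["john"], localist["mary"]))        # 0.0
print("distributed cos(john, mary):", cos(distributed["john"], distributed["mary"]))  # 0.5

# --- Single-level RAAM-style auto-associator ---
# The encoder compresses a (left, right) pair of token vectors into one
# parent vector of the same width; the decoder reconstructs the pair.
# A full RAAM applies the encoder recursively, feeding parent codes back
# in as children until a whole tree is one fixed-width vector.
D = 5            # token width: 3 class units + 2 identity units
H = D            # parent width equals token width, so codes can recurse
W_enc = rng.normal(0.0, 0.5, (H, 2 * D))
W_dec = rng.normal(0.0, 0.5, (2 * D, H))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.stack([np.concatenate([distributed[a], distributed[b]])
              for a in VOCAB for b in VOCAB])   # all left/right pairs

lr = 0.5
for _ in range(3000):                    # plain gradient descent on MSE
    hid = sigmoid(X @ W_enc.T)           # encode: pair -> parent code
    out = sigmoid(hid @ W_dec.T)         # decode: parent -> pair
    err = out - X
    g_out = err * out * (1 - out)              # delta at output layer
    g_hid = (g_out @ W_dec) * hid * (1 - hid)  # delta at hidden layer
    W_dec -= lr * (g_out.T @ hid) / len(X)
    W_enc -= lr * (g_hid.T @ X) / len(X)

print("reconstruction MSE:", float((err ** 2).mean()))

In Chalmers' experiment a separate feed-forward network was then trained to map the RAAM code of an active sentence holistically to the code of its passive counterpart; the abstract's claim is that with class-structured token codes along these lines, that pipeline generalizes perfectly from a 30% training sample to the full corpus.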