This paper shows how the choice of representation substantially affects the generalization performance of connectionist networks. The starting point is Chalmers' simulations involving structure-sensitive processing. Chalmers argued that a connectionist network could handle structure-sensitive processing without using syntactically structured representations. He trained a connectionist architecture to encode/decode distributed representations for simple sentences; these distributed representations were then holistically transformed such that active sentences were mapped to their passive counterparts. However, he noted that the recursive auto-associative memory (RAAM), which was used to encode and decode the distributed representations for the structures, exhibited only a limited ability to generalize when trained to encode/decode a randomly selected sample of the total corpus. When the RAAM was trained to encode/decode all sentences, and a separate transformation network was trained to perform some of the active-passive transformations on the RAAM-encoded sentences, the transformation network generalized perfectly to the remaining test sentences. It is argued here that the main reason for the limited generalization lies not in the RAAM architecture per se, but in the choice of representation for the tokens. This paper shows that 100% generalization can be achieved for Chalmers' original setup (i.e., using only 30% of the total corpus for training). The key to this success is to use distributed representations for the tokens, capturing different characteristics for different classes of tokens (e.g., verbs or nouns).
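The idea of class-sensitive distributed token representations can be illustrated with a minimal sketch. The vector layout, class codes, and helper names below are purely illustrative assumptions, not the encoding actually used by Chalmers or in this paper: each token vector concatenates a class part (shared by all tokens of the same grammatical class) with an identity part (unique per token), so tokens of the same class lie closer together in representation space than tokens of different classes.

```python
# Hypothetical distributed token encoding (illustrative only):
# the first two features code the grammatical class, the remaining
# features code the token's identity within that class.

CLASS_CODES = {
    "noun": (1.0, 0.0),
    "verb": (0.0, 1.0),
}

def encode_token(word_class, identity_bits):
    """Return a distributed representation: class features + identity features."""
    return CLASS_CODES[word_class] + tuple(float(b) for b in identity_bits)

def class_part(vec):
    """The shared class features (first two components in this layout)."""
    return vec[:2]

def overlap(a, b):
    """Count features that are active (1.0) in both token vectors."""
    return sum(1 for x, y in zip(a, b) if x == y == 1.0)

# Three example tokens with non-overlapping identity bits.
john  = encode_token("noun", (1, 0, 0))
mary  = encode_token("noun", (0, 1, 0))
loves = encode_token("verb", (0, 0, 1))
```

Under this scheme, `john` and `mary` share the noun class feature while `john` and `loves` share nothing, so a network processing these vectors can exploit class regularities; local (one-hot) token codes provide no such shared structure.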