LEARNING LONG-TERM DEPENDENCIES IN NARX RECURRENT NEURAL NETWORKS

Citation
T. Lin et al., LEARNING LONG-TERM DEPENDENCIES IN NARX RECURRENT NEURAL NETWORKS, IEEE Transactions on Neural Networks, 7(6), 1996, pp. 1329-1338
Number of citations
40
Subject Categories
Computer Application, Chemistry & Engineering; Engineering, Electrical & Electronic; Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Computer Science, Theory & Methods
ISSN journal
1045-9227
Volume
7
Issue
6
Year of publication
1996
Pages
1329 - 1338
Database
ISI
SICI code
1045-9227(1996)7:6<1329:LLDINR>2.0.ZU;2-#
Abstract
It has recently been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e., those problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called Nonlinear AutoRegressive models with eXogenous inputs (NARX) recurrent neural networks, which have powerful representational capabilities. We have previously reported that gradient descent learning can be more effective in NARX networks than in recurrent neural network architectures that have "hidden states" on problems including grammatical inference and nonlinear system identification. Typically, the network converges much faster and generalizes better than other networks. The results in this paper are consistent with this phenomenon. We present some experimental results which show that NARX networks can often retain information for two to three times as long as conventional recurrent neural networks. We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on long-term dependency problems. We also describe in detail some of the assumptions regarding what it means to latch information robustly and suggest possible ways to loosen these assumptions.
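For context, the NARX architecture the abstract refers to computes its output from a finite window of past inputs and past outputs rather than from an opaque hidden state. A standard statement of the recurrence (the notation below is a sketch supplied here for reference, not copied from this record) is, in LaTeX:

y(t) = \Psi\bigl( u(t), u(t-1), \ldots, u(t-n_u),\; y(t-1), y(t-2), \ldots, y(t-n_y) \bigr)

where u(t) is the exogenous input, y(t) is the output, n_u and n_y are the input and output delay orders, and \Psi is the nonlinear mapping, typically realized by a feedforward network. One intuition consistent with the abstract's claim is that the output delay taps act like shortcut connections in the time-unfolded network, giving gradients shorter paths to inputs far in the past than in a purely hidden-state recurrent network.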