Recurrent neural networks have become popular models for system
identification and time series prediction. Nonlinear autoregressive
models with exogenous inputs (NARX) neural network models are a popular
subclass of recurrent networks and have been used in many applications.
Although embedded memory can be found in all recurrent network models, it
is particularly prominent in NARX models. We show that using intelligent
memory order selection through pruning and good initial heuristics
significantly improves the generalization and predictive performance of
these nonlinear systems on problems as diverse as grammatical inference
and time series prediction.