HIGH-PERFORMANCE TRAINING OF FEEDFORWARD AND SIMPLE RECURRENT NETWORKS

Authors

KALMAN BL KWASNY SC

Citation

B.L. Kalman and S.C. Kwasny, HIGH-PERFORMANCE TRAINING OF FEEDFORWARD AND SIMPLE RECURRENT NETWORKS, Neurocomputing, 14(1), 1997, pp. 63-83

Subject categories

Computer Sciences, Special Topics; Computer Science Artificial Intelligence; Neurosciences

Journal title

Neurocomputing

ISSN journal

0925-2312

Volume

14

Issue

1

Year of publication

1997

Pages

63 - 83

Database

ISI

SICI code

0925-2312(1997)14:1<63:HTOFAS>2.0.ZU;2-8

Abstract

TRAINREC is a system for training feedforward and recurrent neural networks that incorporates several ideas. It uses the conjugate-gradient method, which is demonstrably more efficient than traditional backward error propagation. We assume epoch-based training and derive a new error function having several desirable properties absent from the traditional sum-of-squares-error function. We argue for skip (shortcut) connections where appropriate and the preference for a bipolar sigmoidal yielding values over the [-1, 1] interval. The input feature space is often over-analyzed, but by using singular value decomposition, input patterns can be conditioned for better learning, often with a reduced number of input units. Recurrent networks, in their most general form, require special handling and cannot be simply a re-wiring of the architecture without a corresponding revision of the derivative calculations. There is a careful balance required among the network architecture (specifically, hidden and feedback units), the amount of training applied, and the ability of the network to generalize. These issues often hinge on selecting the proper stopping criterion. Discovering methods that work in theory as well as in practice is difficult, and we have spent a substantial amount of effort evaluating and testing these ideas on real problems to determine their value. This paper encapsulates a number of such ideas, ranging from those motivated by a desire for efficiency of training to those motivated by correctness and accuracy of the result. While this paper is intended to be self-contained, several references are provided to other work upon which many of our claims are based.
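
Illustrative note on the bipolar sigmoidal mentioned in the abstract. The record does not give the paper's formula, so the sketch below assumes the common definition f(x) = 2 / (1 + exp(-x)) - 1, which equals tanh(x / 2) and takes values strictly between -1 and 1. The function names are hypothetical and are not taken from TRAINREC; this is a minimal sketch, not the authors' implementation.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic sigmoid with outputs in (0, 1), shown for comparison."""
    return 1.0 / (1.0 + math.exp(-x))


def bipolar_sigmoid(x: float) -> float:
    """Assumed bipolar sigmoid: 2 * logistic(x) - 1, computed as tanh(x / 2)."""
    return math.tanh(x / 2.0)


def bipolar_sigmoid_derivative(y: float) -> float:
    """Slope df/dx expressed through the output y = f(x): (1 - y^2) / 2."""
    return 0.5 * (1.0 - y * y)


if __name__ == "__main__":
    for x in (-4.0, 0.0, 4.0):
        y = bipolar_sigmoid(x)
        print(f"x={x:+.1f}  bipolar={y:+.4f}  logistic={logistic(x):.4f}  "
              f"slope={bipolar_sigmoid_derivative(y):.4f}")
```

Unlike the logistic function, this unit is centered on zero, so a zero net input yields a zero activation; that symmetry is one plausible reading of the preference the abstract states, though the paper itself should be consulted for the authors' argument.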