MODIFIED QUASI-NEWTON METHODS FOR TRAINING NEURAL NETWORKS

Citation
B. Robitaille et al., MODIFIED QUASI-NEWTON METHODS FOR TRAINING NEURAL NETWORKS, Computers & Chemical Engineering, 20(9), 1996, pp. 1133-1140
Citations number
24
Subject Categories
Computer Applications, Chemistry & Engineering; Engineering, Chemical; Computer Science, Interdisciplinary Applications
ISSN journal
0098-1354
Volume
20
Issue
9
Year of publication
1996
Pages
1133 - 1140
Database
ISI
SICI code
0098-1354(1996)20:9<1133:MQMFTN>2.0.ZU;2-G
Abstract
The backpropagation algorithm is the most popular procedure for training self-learning feedforward neural networks. However, its convergence is slow, since it is essentially a steepest descent method. Several researchers have proposed other approaches to improve convergence: conjugate gradient methods, dynamic modification of learning parameters, quasi-Newton or Newton methods, stochastic methods, etc. Quasi-Newton methods have been criticized because they require significant computation time and memory space to update the Hessian matrix, limiting their use to medium-sized problems. This paper proposes three variations of the classical quasi-Newton approach that take the structure of the network into account. By neglecting some second-order interactions, the sizes of the resulting approximate Hessian matrices are not proportional to the square of the total number of weights in the network but instead depend on the number of neurons of each level. The modified quasi-Newton methods are tested on two examples and compared to classical approaches such as regular quasi-Newton methods, backpropagation and conjugate gradient methods. The numerical results show that one of these approaches, named BFGS-N, offers a clear gain in computational time on large-scale problems over the traditional methods, without requiring large memory space. (C) 1996 Elsevier Science Ltd
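
The abstract does not spell out the BFGS-N construction, so the following is only a minimal sketch of the general idea it describes: keeping a separate, smaller quasi-Newton (BFGS) approximation per layer (a block-diagonal inverse Hessian) instead of one full matrix over all weights. All names, shapes, the damping factor and the placeholder quadratic loss below are illustrative assumptions, not the authors' method.

    # Illustrative sketch only: a block-diagonal ("layer-wise") BFGS update for a
    # tiny feedforward network. The exact BFGS-N construction is not given in the
    # abstract; the test problem and all constants here are assumptions.
    import numpy as np

    def bfgs_update(H, s, y):
        """Standard BFGS update of an inverse-Hessian approximation H
        from a step s and a gradient difference y."""
        sy = float(s @ y)
        if sy <= 1e-12:          # skip the update if the curvature condition fails
            return H
        rho = 1.0 / sy
        I = np.eye(len(s))
        V = I - rho * np.outer(s, y)
        return V @ H @ V.T + rho * np.outer(s, s)

    # One separate inverse-Hessian block per layer, so storage grows with the
    # square of each layer's weight count instead of the square of the total.
    rng = np.random.default_rng(0)
    layer_sizes = [(3, 4), (4, 2)]                     # (inputs, outputs) per layer
    weights = [rng.standard_normal(m * n) * 0.1 for m, n in layer_sizes]
    H_blocks = [np.eye(m * n) for m, n in layer_sizes]

    def loss_and_grads(ws):
        # Placeholder quadratic loss per layer, standing in for the network error;
        # a real implementation would backpropagate through the network instead.
        grads = [w.copy() for w in ws]
        loss = 0.5 * sum(float(w @ w) for w in ws)
        return loss, grads

    loss, grads = loss_and_grads(weights)
    for it in range(20):
        # Damped quasi-Newton step, applied block by block.
        new_weights = [w - 0.5 * (H @ g) for w, g, H in zip(weights, grads, H_blocks)]
        new_loss, new_grads = loss_and_grads(new_weights)
        for i, (w, nw, g, ng) in enumerate(zip(weights, new_weights, grads, new_grads)):
            H_blocks[i] = bfgs_update(H_blocks[i], nw - w, ng - g)
        weights, grads, loss = new_weights, new_grads, new_loss
    print(f"final loss after 20 block-wise BFGS steps: {loss:.2e}")

The point of the sketch is the memory argument made in the abstract: the full inverse Hessian over all 20 weights would need a 20 x 20 matrix, whereas the two per-layer blocks need only 12 x 12 and 8 x 8, a saving that grows quickly with network size.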