Analysis of natural gradient descent for multilayer neural networks

Citation
M. Rattray and D. Saad, Analysis of natural gradient descent for multilayer neural networks, PHYS REV E, 59(4), 1999, pp. 4523-4532
Number of citations
22
Subject categories
Physics
Journal title
PHYSICAL REVIEW E
Journal ISSN
1063-651X
Volume
59
Issue
4
Year of publication
1999
Pages
4523 - 4532
Database
ISI
SICI code
1063-651X(199904)59:4<4523:AONGDF>2.0.ZU;2-T
Abstract
Natural gradient descent is a principled method for adapting the parameters of a statistical model on-line using an underlying Riemannian parameter space to redefine the direction of steepest descent. The algorithm is examined via methods of statistical physics that accurately characterize both transient and asymptotic behavior. A solution of the learning dynamics is obtained for the case of multilayer neural network training in the limit of large input dimension. We find that natural gradient learning leads to optimal asymptotic performance and outperforms gradient descent in the transient, significantly shortening or even removing plateaus in the transient generalization performance that typically hamper gradient descent training.
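
Illustrative note (not part of the original record): the update rule the abstract refers to can be sketched on a toy problem. The example below is a minimal sketch, not the multilayer network analysed in the paper. It assumes a linear model with unit-variance Gaussian noise, for which the Fisher information matrix equals the input covariance E[x x^T], and it assumes that covariance is known exactly. The function names, the covariance values, and the annealed learning rate 1/(t + t0) are all hypothetical choices made for illustration.

```python
import numpy as np

def natural_gradient_step(w, x, y, fisher_inv, lr):
    """One on-line update for the assumed model y ~ N(w.x, 1).

    The ordinary gradient of the squared error is premultiplied by the
    inverse Fisher matrix, which is what turns it into a natural gradient.
    """
    error = w @ x - y
    grad = error * x                      # gradient of 0.5 * (w.x - y)**2
    return w - lr * fisher_inv @ grad

def run_demo(steps=5000, t0=20, seed=0):
    """Train on a badly conditioned toy problem (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    variances = np.array([10.0, 1.0, 0.1])     # input variances per dimension
    fisher_inv = np.diag(1.0 / variances)      # Fisher matrix here is E[x x^T]
    w_true = np.array([1.0, -2.0, 0.5])        # hypothetical teacher weights
    w = np.zeros(3)
    for t in range(1, steps + 1):
        x = rng.normal(size=3) * np.sqrt(variances)
        y = w_true @ x + rng.normal()          # unit-variance output noise
        lr = 1.0 / (t + t0)                    # annealed learning rate
        w = natural_gradient_step(w, x, y, fisher_inv, lr)
    return w, w_true

if __name__ == "__main__":
    w, w_true = run_demo()
    print("learned:", w)
    print("target: ", w_true)
```

With the identity matrix passed as `fisher_inv`, the step reduces to ordinary on-line gradient descent, which makes the two methods easy to compare on the same data stream.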