Natural gradient descent is an on-line variable-metric optimization algorithm which utilizes an underlying Riemannian parameter space. We analyze the dynamics of natural gradient descent beyond the asymptotic regime by employing an exact statistical mechanics description of learning in two-layer feed-forward neural networks. For a realizable learning scenario we find significant improvements over standard gradient descent for both the transient and asymptotic stages of learning, with a slower power law increase in learning time as task complexity grows. [S0031-9007(98)07950-2].