A. H. L. West and D. Saad, "Online learning with adaptive backpropagation in 2-layer networks," Physical Review E 56(3), 1997, pp. 3426-3445
An adaptive back-propagation algorithm parametrized by an inverse temperature beta is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, we analyze these learning algorithms in both the symmetric and the convergence phase for finite learning rates in the case of uncorrelated teachers of similar but arbitrary length T. These analyses show that adaptive back-propagation generally results in faster training than gradient descent, by breaking the symmetry between hidden units more efficiently and by providing faster convergence to optimal generalization.
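The on-line setting described above (a student two-layer committee machine trained example by example against a fixed teacher of length T) can be sketched in a few lines. The β-weighted update below is a hypothetical illustrative stand-in for the paper's adaptive rule, not its exact form; likewise, tanh is used in place of the error-function activation common in these statistical-mechanics analyses, and all sizes and rates (N, K, eta, beta) are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 50, 3          # input dimension and number of hidden units (arbitrary)
T_len = 1.0           # teacher weight-vector length, the abstract's T (arbitrary)
eta, beta = 0.5, 4.0  # learning rate and inverse temperature (arbitrary)

g = np.tanh           # stand-in for the erf activation of the analyses

# Teacher: uncorrelated hidden weight vectors of similar length T_len
B = rng.standard_normal((K, N)) * (T_len / np.sqrt(N))

def forward(W, xi):
    h = W @ xi                  # hidden pre-activations
    return h, g(h).sum()        # committee-machine output (unit output weights)

def update(W, xi, zeta, adaptive):
    h, sigma = forward(W, xi)
    delta = (zeta - sigma) * (1.0 - g(h) ** 2)   # per-hidden-unit error signal
    if adaptive:
        # Hypothetical beta-reweighting: a softmax-style factor with inverse
        # temperature beta emphasizes the hidden units with the largest error
        # signal, which tends to break the inter-unit symmetry faster.
        # Illustrative only; NOT the paper's exact adaptive rule.
        w = np.exp(beta * np.abs(delta))
        w *= K / w.sum()                         # keep the mean weight at 1
        delta = delta * w
    W += (eta / N) * np.outer(delta, xi)         # on-line gradient step

def generalization_error(W, n=2000):
    # Monte Carlo estimate of 0.5 * E[(student - teacher)^2] on fresh inputs
    X = rng.standard_normal((n, N))
    s = g(X @ W.T).sum(axis=1)
    t = g(X @ B.T).sum(axis=1)
    return 0.5 * float(np.mean((s - t) ** 2))

# Train a gradient-descent student and an adaptive student on the same stream
W_gd = rng.standard_normal((K, N)) / np.sqrt(N)
W_ad = W_gd.copy()
for _ in range(20000):
    xi = rng.standard_normal(N)
    zeta = forward(B, xi)[1]     # teacher label for this on-line example
    update(W_gd, xi, zeta, adaptive=False)
    update(W_ad, xi, zeta, adaptive=True)

print(generalization_error(W_gd), generalization_error(W_ad))
```

Comparing the two printed errors over runs gives a crude empirical counterpart to the paper's analytical comparison of the symmetric and convergence phases.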