Dynamics of learning with restricted training sets

Citation
A.C.C. Coolen and D. Saad, Dynamics of learning with restricted training sets, PHYS REV E, 62(4), 2000, pp. 5444-5487
Citation count
21
Subject categories
Physics
Journal title
PHYSICAL REVIEW E
ISSN journal
1063-651X
Volume
62
Issue
4
Year of publication
2000
Part
B
Pages
5444 - 5487
Database
ISI
SICI code
1063-651X(200010)62:4<5444:DOLWRT>2.0.ZU;2-F
Abstract
We study the dynamics of supervised learning in layered neural networks, in the regime where the size p of the training set is proportional to the number N of inputs. Here the local fields are no longer described by Gaussian probability distributions and the learning dynamics is of a spin-glass nature, with the composition of the training set playing the role of quenched disorder. We show how dynamical replica theory can be used to predict the evolution of macroscopic observables, including the two relevant performance measures (training error and generalization error), incorporating the old formalism developed for complete training sets in the limit α = p/N → ∞ as a special case. For simplicity, we restrict ourselves in this paper to single-layer networks and realizable tasks. In the case of (on-line and batch) Hebbian learning, where a direct exact solution is possible, we show that our theory provides exact results at any time in many different verifiable cases. For non-Hebbian learning rules, such as PERCEPTRON and ADATRON, we find very good agreement between the predictions of our theory and numerical simulations. Finally, we derive three approximation schemes aimed at eliminating the need to solve a functional saddle-point equation at each time step, and we assess their performance. The simplest of these schemes leads to a fully explicit and relatively simple nonlinear diffusion equation for the joint field distribution, which already describes the learning dynamics surprisingly well over a wide range of parameters.
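Illustration
The following is a minimal sketch (not from the paper) of the setting the abstract describes: on-line Hebbian learning for a single-layer perceptron with a quenched training set of p = αN examples, a realizable task defined by a teacher perceptron, and the two performance measures. The parameter values (N, alpha, eta, the number of update steps) and the use of the standard perceptron overlap formula E_g = arccos(ω)/π are illustrative assumptions, not results taken from the paper.

    import numpy as np

    # Sketch of on-line Hebbian learning with a restricted training set,
    # in the regime p = alpha * N.  Parameter values are illustrative.
    rng = np.random.default_rng(0)

    N, alpha, eta = 200, 0.5, 1.0              # inputs, set-size ratio, learning rate
    p = int(alpha * N)                          # restricted training set: p examples

    B = rng.standard_normal(N)                  # teacher weights (realizable task)
    xi = rng.choice([-1.0, 1.0], size=(p, N))   # quenched training inputs
    T = np.sign(xi @ B)                         # teacher labels for the stored examples

    w = np.zeros(N)                             # student starts from tabula rasa
    for t in range(20 * p):                     # many passes over the fixed set
        mu = rng.integers(p)                    # draw a stored example at random
        w += (eta / N) * T[mu] * xi[mu]         # Hebbian update: no error dependence

    # Training error: fraction of stored examples the student misclassifies.
    E_t = np.mean(np.sign(xi @ w) != T)

    # Generalization error for two perceptrons on random inputs:
    # E_g = arccos(omega)/pi, with omega the student-teacher overlap.
    omega = (w @ B) / (np.linalg.norm(w) * np.linalg.norm(B))
    E_g = np.arccos(omega) / np.pi

    print(f"alpha={alpha}: E_t={E_t:.3f}, E_g={E_g:.3f}")

Because the same p examples are revisited, the training error E_t and the generalization error E_g separate, which is the finite-α effect the paper's dynamical replica theory is built to capture; for Hebbian learning this sketch can be checked against the paper's exact solution.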