We study the dynamics of supervised learning in layered neural networks, in the regime where the size p of the training set is proportional to the number N of inputs. Here the local fields are no longer described by Gaussian probability distributions and the learning dynamics is of a spin-glass nature, with the composition of the training set playing the role of quenched disorder. We show how dynamical replica theory can be used to predict the evolution of macroscopic observables, including the two relevant performance measures (training error and generalization error), incorporating the old formalism developed for complete training sets in the limit alpha = p/N --> infinity as a special case. For simplicity, we restrict ourselves in this paper to single-layer networks and realizable tasks. In the case of (on-line and batch) Hebbian learning, where a direct exact solution is possible, we show that our theory provides exact results at any time in many different verifiable cases. For non-Hebbian learning rules, such as PERCEPTRON and ADATRON, we find very good agreement between the predictions of our theory and numerical simulations. Finally, we derive three approximation schemes aimed at eliminating the need to solve a functional saddle-point equation at each time step, and we assess their performance. The simplest of these schemes leads to a fully explicit and relatively simple nonlinear diffusion equation for the joint field distribution, which already describes the learning dynamics surprisingly well over a wide range of parameters.
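For concreteness, a minimal sketch of the scaling regime and the Hebbian rules referred to above, in standard notation that is our assumption rather than taken from this paper (weights J_i, binary training inputs xi^mu, teacher labels T(xi^mu), learning rate eta):

\begin{align}
  \alpha &= \frac{p}{N} \quad (\text{kept finite as } N \to \infty), \\
  J_i(t+1) &= J_i(t) + \frac{\eta}{N}\, T\!\left(\xi^{\mu(t)}\right) \xi_i^{\mu(t)}
  \quad (\text{on-line Hebbian: } \mu(t) \text{ drawn at random from } \{1,\dots,p\}), \\
  J_i(t+1) &= J_i(t) + \frac{\eta}{p N} \sum_{\mu=1}^{p} T(\xi^{\mu})\, \xi_i^{\mu}
  \quad (\text{batch Hebbian: average over the full training set}).
\end{align}

In this regime the p local fields J(t) \cdot xi^mu remain correlated through the fixed training set, which is why their joint distribution is non-Gaussian and the training-set composition acts as quenched disorder.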