We complement recent advances in thermodynamic limit analyses of mean
on-line gradient descent learning dynamics in multilayer networks by
calculating fluctuations possessed by finite-dimensional systems.
Fluctuations from the mean dynamics are largest at the onset of
specialisation as student hidden unit weight vectors begin to imitate
specific teacher vectors, increasing with the degree of symmetry of the
initial conditions. In light of this, we include a term to stimulate
asymmetry in the learning process, which typically also leads to a
significant decrease in training time.