The two-layer radial basis function network, with fixed centers of the basis functions, is analyzed within a stochastic training paradigm. Various definitions of generalization error are considered, and two such definitions are employed in deriving generic learning curves and generalization properties, both with and without a weight-decay term. The generalization error is shown analytically to be related to the evidence and, via the evidence, to the prediction error and free energy. The generalization behavior is explored; the generic learning curve is found to be inversely proportional to the number of training pairs presented. Optimization of training is considered by minimizing the generalization error with respect to the free parameters of the training algorithms. Finally, the effect of the joint activations between hidden-layer units is examined and shown to speed training.
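To make the model class concrete, the following is a minimal sketch of a two-layer RBF network with fixed Gaussian centers and a weight-decay (ridge) penalty on the linear output weights. The toy data, center grid, basis width, and decay strength `lam` are illustrative assumptions, not values taken from the analysis above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: noisy samples of a 1-D target function (illustrative only).
X = rng.uniform(-1.0, 1.0, size=(40, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(40)

# Fixed basis-function centers: chosen in advance, not adapted during training.
centers = np.linspace(-1.0, 1.0, 10).reshape(-1, 1)
width = 0.3  # common width for all Gaussian basis functions (assumption)

def design_matrix(X):
    # Hidden-layer activations: Gaussian RBF response of each center to each input.
    sq_dists = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * width ** 2))

# Output weights by regularized least squares; lam plays the role of the
# weight-decay term on the second-layer weights.
lam = 1e-3
Phi = design_matrix(X)
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(centers)), Phi.T @ y)

def predict(X_new):
    return design_matrix(X_new) @ w
```

Because the centers are fixed, training reduces to a linear problem in the output weights, which is what makes the analytical treatment of learning curves and weight decay tractable.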