Based on an observation about the differing effects of ensemble averaging on the bias and variance portions of the prediction error, we discuss training methodologies for ensembles of networks. We demonstrate the effect of variance reduction and present a method of extrapolation to the limit of an infinite ensemble. A significant reduction of variance is obtained by averaging only over the initial conditions of the neural networks, without varying architectures or training sets. The minimum of the ensemble prediction error is reached later than that of a single network. In the vicinity of the minimum, the ensemble prediction error appears flatter than that of the single network, thus simplifying the optimal-stopping decision. The results are demonstrated on sunspot data, where the predictions are among the best obtained, and on data set B of the 1993 energy prediction competition.
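The variance-reduction and extrapolation ideas above can be sketched numerically. In this toy illustration (an assumption for exposition, not the paper's experimental setup), each ensemble member is modeled as the true function plus independent noise, standing in for networks that differ only in their random initial conditions; the test error of the ensemble average is then fit linearly against 1/K and extrapolated to the intercept at 1/K → 0, the infinite-ensemble limit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an ensemble: each "member" is the true function plus
# independent zero-mean noise, mimicking networks that differ only in
# their random initial conditions (a simplifying assumption).
x = np.linspace(0.0, 1.0, 2000)
f_true = np.sin(2.0 * np.pi * x)

n_models = 32
noise_std = 0.5
preds = f_true + noise_std * rng.standard_normal((n_models, x.size))

# Mean squared error of the ensemble average as a function of size K.
# Averaging K independent members shrinks the variance term like 1/K.
sizes = [1, 2, 4, 8, 16, 32]
errors = [np.mean((preds[:k].mean(axis=0) - f_true) ** 2) for k in sizes]

# Linear fit of error against 1/K: the slope estimates the variance
# contribution, and the intercept extrapolates to the infinite-ensemble
# error (here near zero, since this toy model has no bias term).
slope, intercept = np.polyfit([1.0 / k for k in sizes], errors, 1)
print(f"error(K=1)={errors[0]:.4f}  error(K=32)={errors[-1]:.4f}  "
      f"extrapolated limit={intercept:.4f}")
```

The same linear-in-1/K fit can be applied to measured validation errors of real ensembles of trained networks, which is the spirit of the extrapolation discussed above.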