R. Reed et al., "Similarities of Error Regularization, Sigmoid Gain Scaling, Target Smoothing, and Training with Jitter," IEEE Transactions on Neural Networks, 6(3), 1995, pp. 529-538
The generalization performance of feedforward layered perceptrons can, in many cases, be improved either by smoothing the target via convolution, regularizing the training error with a smoothing constraint, decreasing the gain (i.e., slope) of the sigmoid nonlinearities, or adding noise (i.e., jitter) to the input training data. In certain important cases, these procedures yield highly similar results, although at different costs. Training with jitter, for example, requires significantly more computation than sigmoid scaling.
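The relationships summarized above can be sketched numerically. The snippet below is an illustrative toy, not code from the paper: the weights, input, and noise level are hypothetical. It shows (a) that lowering the sigmoid gain g is exactly equivalent to evaluating a unit-gain sigmoid on weights scaled by g, and (b) that averaging a unit's output over jittered copies of the input approximates convolving (smoothing) the network function with the noise density.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z, gain=1.0):
    # Sigmoid nonlinearity with adjustable gain (slope).
    return 1.0 / (1.0 + np.exp(-gain * z))

# Hypothetical weights and input for a single unit (illustration only).
w = np.array([1.5, -0.8])
x = np.array([0.4, 1.2])

# (a) Sigmoid gain scaling: a gain-g sigmoid on w.x equals a
# unit-gain sigmoid on (g*w).x, so decreasing the gain acts like
# shrinking the weights.
g = 0.5
out_gain = sigmoid(w @ x, gain=g)
out_scaled_w = sigmoid((g * w) @ x)

# (b) Training with jitter: averaging the output over noisy copies
# of the input approximates the convolution-smoothed response,
# which stays close to the noise-free output for small noise.
noise_std = 0.3
jittered = x + noise_std * rng.normal(size=(10_000, 2))
out_jitter = sigmoid(jittered @ w).mean()
```

The jitter average is the expensive route: it needs many noisy forward passes to estimate the smoothed response, whereas the gain-scaled output is a single evaluation, which is the cost difference the abstract points out.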