Noise injection consists of adding noise to the inputs during neural network training.
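For concreteness, here is a minimal sketch of Gaussian input-noise injection in a plain gradient-descent loop on a toy linear least-squares problem; the model, step size, noise level, and synthetic data are illustrative choices and are not taken from this article.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 10))                             # clean inputs
    y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=256)   # targets
    w = np.zeros(10)
    sigma, lr = 0.1, 0.05                                      # noise level, step size

    for step in range(500):
        eps = sigma * rng.normal(size=X.shape)   # fresh Gaussian noise each step
        X_noisy = X + eps                        # perturb the inputs only
        residual = X_noisy @ w - y
        grad = X_noisy.T @ residual / len(y)     # gradient of the (random) perturbed cost
        w -= lr * grad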
Experimental results suggest that it might improve the generalization ability of the resulting neural network. A justification of this improvement remains elusive: analytically describing the average perturbed cost function is difficult, and controlling the fluctuations of the random perturbed cost function is hard. Hence, recent papers suggest replacing the random perturbed cost with a (deterministic) Taylor approximation of the average perturbed cost function.
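To illustrate the kind of surrogate in question (the notation below is ours, not the article's): for a cost \ell that is twice differentiable at x and zero-mean Gaussian noise \varepsilon \sim \mathcal N(0, \sigma^2 I_d), a second-order expansion yields
\[
\mathbb{E}_{\varepsilon}\big[\ell(x+\varepsilon)\big]
\approx \ell(x) + \frac{\sigma^2}{2}\,\operatorname{tr}\nabla^2 \ell(x),
\]
since the first-order term vanishes in expectation and \mathbb{E}[\varepsilon\varepsilon^{\top}] = \sigma^2 I_d; the deterministic surrogate then replaces the random perturbed cost with the right-hand side.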
This article takes a different stance: when the injected noise is Gaussian, noise injection is naturally connected to the action of the heat kernel.
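Concretely (again with illustrative notation), averaging a cost \ell over Gaussian input noise amounts to applying the heat semigroup to \ell:
\[
\mathbb{E}_{\varepsilon \sim \mathcal N(0, \sigma^2 I_d)}\big[\ell(x+\varepsilon)\big]
= \int_{\mathbb{R}^d} \ell(y)\, p_t(x, y)\, \mathrm{d}y
= \big(e^{t\Delta} \ell\big)(x),
\qquad t = \frac{\sigma^2}{2},
\]
where p_t(x, y) = (4\pi t)^{-d/2} \exp\big(-\|x-y\|^2/(4t)\big) is the Euclidean heat kernel.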
This indicates the domain in which traditional Taylor expansions remain relevant and shows how the quality of such Taylor approximations depends on global smoothness properties of the neural networks under consideration. The connection between noise injection and the heat kernel also makes it possible to control the fluctuations of the random perturbed cost function. Under the global smoothness assumption, tools from Gaussian analysis provide bounds on the tail behavior of the perturbed cost.
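The flavor of such tail bounds can be conveyed by the standard Gaussian concentration inequality, stated here under an assumed (not established in this abstract) Lipschitz bound: if \varepsilon \mapsto \ell(x+\varepsilon) is L-Lipschitz and \varepsilon \sim \mathcal N(0, \sigma^2 I_d), then for every u > 0,
\[
\mathbb{P}\Big( \big|\ell(x+\varepsilon) - \mathbb{E}\,\ell(x+\varepsilon)\big| \ge u \Big)
\le 2 \exp\!\Big( -\frac{u^2}{2 L^2 \sigma^2} \Big),
\]
so the random perturbed cost concentrates around its heat-kernel average at a rate governed by a global smoothness (Lipschitz) constant.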
This finally suggests that mixing input perturbation with smoothness-based penalization might be profitable.