NOISE INJECTION - THEORETICAL PROSPECTS

Citation
Y. Grandvalet et al., NOISE INJECTION - THEORETICAL PROSPECTS, Neural computation, 9(5), 1997, pp. 1093-1108
Number of citations
25
Subject categories
Computer Sciences; Computer Science, Artificial Intelligence; Neurosciences
Journal title
Neural computation
ISSN journal
08997667
Volume
9
Issue
5
Year of publication
1997
Pages
1093 - 1108
Database
ISI
SICI code
0899-7667(1997)9:5<1093:NI-TP>2.0.ZU;2-Z
Abstract
Noise injection consists of adding noise to the inputs during neural network training. Experimental results suggest that it might improve the generalization ability of the resulting neural network. A justification of this improvement remains elusive: describing analytically the average perturbed cost function is difficult, and controlling the fluctuations of the random perturbed cost function is hard. Hence, recent papers suggest replacing the random perturbed cost by a (deterministic) Taylor approximation of the average perturbed cost function. This article takes a different stance: when the injected noise is Gaussian, noise injection is naturally connected to the action of the heat kernel. This provides indications on the relevance domain of traditional Taylor expansions and shows the dependence of the quality of Taylor approximations on global smoothness properties of the neural networks under consideration. The connection between noise injection and the heat kernel also enables controlling the fluctuations of the random perturbed cost function. Under the global smoothness assumption, tools from Gaussian analysis provide bounds on the tail behavior of the perturbed cost. This finally suggests that mixing input perturbation with smoothness-based penalization might be profitable.
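
The key relation summarized in the abstract, that averaging the cost over Gaussian input noise amounts to convolving the cost with a Gaussian (heat) kernel, and that a second-order Taylor expansion approximates this average when the cost is smooth, can be checked numerically. The sketch below is not taken from the paper; the toy one-dimensional cost function, the noise level sigma, and the sample sizes are arbitrary assumptions chosen only to show the three quantities agreeing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy smooth "cost as a function of the input" (arbitrary choice).
def cost(x):
    return np.sin(3.0 * x) + 0.5 * x ** 2

def cost_second_derivative(x):
    return -9.0 * np.sin(3.0 * x) + 1.0

x0 = 0.7      # input at which the cost is perturbed
sigma = 0.1   # standard deviation of the injected Gaussian noise

# 1) Random perturbed cost, averaged by Monte Carlo: E[cost(x0 + noise)].
noise = rng.normal(0.0, sigma, size=200_000)
mc_average = cost(x0 + noise).mean()

# 2) Heat-kernel view: the same average is the convolution of the cost
#    with a Gaussian kernel of width sigma (numerical quadrature here).
grid = np.linspace(-5 * sigma, 5 * sigma, 2001)
kernel = np.exp(-grid ** 2 / (2 * sigma ** 2))
kernel /= kernel.sum()
heat_kernel_average = np.sum(kernel * cost(x0 + grid))

# 3) Deterministic second-order Taylor approximation of the average
#    perturbed cost: cost(x0) + (sigma^2 / 2) * cost''(x0).
taylor_average = cost(x0) + 0.5 * sigma ** 2 * cost_second_derivative(x0)

print(f"Monte-Carlo average  : {mc_average:.6f}")
print(f"Heat-kernel average  : {heat_kernel_average:.6f}")
print(f"Taylor approximation : {taylor_average:.6f}")
```

For a smooth cost and small sigma the three values nearly coincide; when the cost lacks global smoothness or sigma grows, the Taylor term drifts away from the heat-kernel average, which is the regime the abstract warns about.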