This article analyzes learning in continuous stochastic neural networks defined by stochastic differential equations (SDEs). In particular, it studies gradient descent learning rules for training the equilibrium solutions of these networks. A theorem is given that specifies sufficient conditions under which the gradient descent learning rules become local covariance statistics between two random variables: (1) an evaluator that is the same for all the network parameters, and (2) a system variable that is independent of the learning objective. While this article focuses on continuous stochastic neural networks, the theorem applies to any other system with Boltzmann-like equilibrium distributions. The generality of the theorem suggests that, instead of suppressing the noise present in physical devices, a natural alternative is to use it to simplify the credit assignment problem. In deterministic networks, credit assignment requires an evaluation signal that is different for each node in the network. Surprisingly, when noise is not suppressed, all that is needed is an evaluator that is the same for the entire network and a local Hebbian signal. This modularization of signals greatly simplifies hardware and software implementations. The article shows how the theorem applies to four different learning objectives that span supervised, reinforcement, and unsupervised problems: (1) regression, (2) density estimation, (3) risk minimization, and (4) information maximization. Simulations, implementation issues, and implications for computational neuroscience are discussed.
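As a minimal, hypothetical sketch (not code from the article), the covariance form of the learning rule described above can be estimated by Monte Carlo over equilibrium samples: each weight update is the sample covariance between a single global evaluator signal and that weight's local Hebbian signal. The function name `covariance_update` and the specific signal shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def covariance_update(evaluator, hebbian, lr=1.0):
    """Weight update as the sample covariance between a global
    evaluator and per-weight local Hebbian signals.

    evaluator: shape (T,)   -- one scalar per equilibrium sample,
                               shared by every parameter (global signal)
    hebbian:   shape (T, W) -- local pre*post activity product per weight
    Returns a (W,) vector: lr * Cov(evaluator, hebbian[:, w]) for each w.
    """
    e = evaluator - evaluator.mean()
    h = hebbian - hebbian.mean(axis=0)
    return lr * (e @ h) / (len(e) - 1)

# Toy demo: 1000 equilibrium samples, 3 weights. The evaluator is
# correlated only with the Hebbian signal of weight 0, so only that
# weight receives a substantial update.
T, W = 1000, 3
h = rng.normal(size=(T, W))
e = 2.0 * h[:, 0] + rng.normal(size=T)
dw = covariance_update(e, h)
```

The point of the sketch is the modularization claimed in the abstract: the evaluator `e` carries no per-parameter information, yet combined with purely local Hebbian statistics it singles out the parameter that influences the objective.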