We formulate a learning algorithm for online learning in neural networks using the extended Kalman filter approach, providing a principled and practicable approximation to the full Bayesian treatment. The latter, which constitutes optimal learning, does not require artificial setting of training parameters and allows for the estimation of a wide range of quantities of interest. We analyse the performance of the algorithm using tools of statistical physics in several scenarios: we look at drifting rules represented by linear and nonlinear perceptrons and investigate how different prior settings affect the generalization performance as well as learnability itself. We also investigate the learning behaviour for a stationary two-layer network, where the algorithm seems to avoid the otherwise common problem of long symmetric plateaus.
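
For orientation, a minimal sketch of a generic extended-Kalman-filter weight update for a network with output $f(\mathbf{w}, \mathbf{x})$ is given below; the notation ($Q$ for the drift covariance, $R$ for the output-noise variance, $P$ for the posterior covariance) is illustrative and not taken from the paper's own formulation:
\[
\begin{aligned}
P_{t|t-1} &= P_{t-1} + Q,
&\quad H_t &= \nabla_{\mathbf{w}} f(\hat{\mathbf{w}}_{t-1}, \mathbf{x}_t)^{\top},\\
K_t &= P_{t|t-1} H_t^{\top}\!\left(H_t P_{t|t-1} H_t^{\top} + R\right)^{-1},
&\quad \hat{\mathbf{w}}_t &= \hat{\mathbf{w}}_{t-1} + K_t\!\left(y_t - f(\hat{\mathbf{w}}_{t-1}, \mathbf{x}_t)\right),\\
P_t &= \left(I - K_t H_t\right) P_{t|t-1}. &&
\end{aligned}
\]
Here the prediction step $P_{t-1} \mapsto P_{t-1} + Q$ corresponds to a random-walk (drifting-rule) prior on the weights, and the update linearizes the network output around the current weight estimate.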