We analyze and compare the well-known gradient descent algorithm and the more recent exponentiated gradient algorithm for training a single neuron with an arbitrary transfer function. Both algorithms are easily generalized to larger neural networks, and the generalization of gradient descent is the standard backpropagation algorithm. In this paper we prove worst-case loss bounds for both algorithms in the single-neuron case. Since local minima make it difficult to prove worst-case bounds for gradient-based algorithms, we must use a loss function that prevents the formation of spurious local minima. We define such a matching loss function for any strictly increasing differentiable transfer function and prove worst-case loss bounds for any such transfer function and its corresponding matching loss. For example, the matching loss for the identity function is the square loss, and the matching loss for the logistic transfer function is the entropic loss. The different forms of the two algorithms' bounds indicate that exponentiated gradient outperforms gradient descent when the inputs contain a large number of irrelevant components. Simulations on synthetic data confirm these analytical results.
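As a concrete sketch (the notation below is ours and is not taken from the abstract), the matching loss for a strictly increasing differentiable transfer function $\phi$ is commonly written as

\[
L_\phi(y, \hat{y}) \;=\; \int_{\phi^{-1}(y)}^{\phi^{-1}(\hat{y})} \bigl(\phi(z) - y\bigr)\, dz .
\]

Taking $\phi(z) = z$ gives $L_\phi(y,\hat{y}) = \tfrac{1}{2}(\hat{y} - y)^2$, the square loss up to a constant factor, while the logistic function $\phi(z) = 1/(1 + e^{-z})$ gives the entropic loss. When $\hat{y} = \phi(w \cdot x)$, this loss is convex in the weight vector $w$, which is the property that rules out spurious local minima.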
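To make the comparison of the two algorithms concrete, their single-neuron updates can be sketched as follows (again, the learning rate $\eta$ and the example sequence $(x_t, y_t)$ are assumed notation, not quoted from the abstract). With prediction $\hat{y}_t = \phi(w_t \cdot x_t)$, gradient descent updates the weights additively,

\[
w_{t+1} \;=\; w_t - \eta\, \nabla_{w_t} L_\phi(y_t, \hat{y}_t),
\]

whereas exponentiated gradient updates them multiplicatively and renormalizes,

\[
w_{t+1,i} \;=\; \frac{w_{t,i}\, \exp\!\bigl(-\eta\, \partial L_\phi(y_t, \hat{y}_t)/\partial w_{t,i}\bigr)}{\sum_{j} w_{t,j}\, \exp\!\bigl(-\eta\, \partial L_\phi(y_t, \hat{y}_t)/\partial w_{t,j}\bigr)} .
\]

The normalization shown here assumes the weights are kept on the probability simplex; unnormalized variants of exponentiated gradient also appear in the literature.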