Relative loss bounds for multidimensional regression problems

Citation
J. Kivinen and M.K. Warmuth, Relative loss bounds for multidimensional regression problems, MACH LEARN, 45(3), 2001, pp. 301-329
Number of citations
22
Subject Categories
AI Robotics and Automatic Control
Journal title
MACHINE LEARNING
Journal ISSN
0885-6125
Volume
45
Issue
3
Year of publication
2001
Pages
301 - 329
Database
ISI
SICI code
0885-6125(2001)45:3<301:RLBFMR>2.0.ZU;2-6
Abstract
We study on-line generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. We allow at the final layer transfer functions such as the softmax function that need to consider the linear activations to all the output neurons. The weight vectors used to produce the linear activations are represented indirectly by maintaining separate parameter vectors. We get the weight vector by applying a particular parameterization function to the parameter vector. Updating the parameter vectors upon seeing new examples is done additively, as in the usual gradient descent update. However, by using a nonlinear parameterization function between the parameter vectors and the weight vectors, we can make the resulting update of the weight vector quite different from a true gradient descent update. To analyse such updates, we define a notion of a matching loss function and apply it both to the transfer function and to the parameterization function. The loss function that matches the transfer function is used to measure the goodness of the predictions of the algorithm. The loss function that matches the parameterization function can be used both as a measure of divergence between models in motivating the update rule of the algorithm and as a measure of progress in analyzing its relative performance compared to an arbitrary fixed model. As a result, we have a unified treatment that generalizes earlier results for the gradient descent and exponentiated gradient algorithms to multidimensional outputs, including multiclass logistic regression.
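The additive update on parameter vectors described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`predict`, `update`, `identity`), the learning rate `eta`, and the choice of applying the parameterization function row by row are all assumptions made for illustration. The sketch uses the softmax transfer function with relative entropy as its matching loss; passing `identity` as the parameterization gives a gradient-descent-style update, while passing `softmax` gives a normalized, exponentiated-gradient-style update of the weights.

```python
import math

def softmax(a):
    # Numerically stable softmax over a list of activations.
    m = max(a)
    e = [math.exp(v - m) for v in a]
    s = sum(e)
    return [v / s for v in e]

def identity(theta_row):
    # Identity parameterization: weights equal parameters (gradient-descent style).
    return list(theta_row)

def relative_entropy(y, y_hat):
    # Loss assumed here to match the softmax transfer function.
    return sum(yj * math.log(yj / pj) for yj, pj in zip(y, y_hat) if yj > 0)

def predict(theta, x, param_fn):
    # Weights are obtained from the parameters via the parameterization function
    # (applied per output row; this per-row choice is an assumption).
    weights = [param_fn(row) for row in theta]
    activations = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(activations)

def update(theta, x, y, eta, param_fn):
    # Additive step on the parameters. For softmax with relative entropy,
    # the gradient with respect to the activations is (y_hat - y).
    y_hat = predict(theta, x, param_fn)
    new_theta = [[t - eta * (p - yj) * xi for t, xi in zip(row, x)]
                 for row, p, yj in zip(theta, y_hat, y)]
    return new_theta, relative_entropy(y, y_hat)

theta = [[0.0, 0.0], [0.0, 0.0]]  # two outputs, two inputs
x, y = [1.0, 0.0], [1.0, 0.0]     # hypothetical training example
for _ in range(3):
    theta, loss = update(theta, x, y, eta=1.0, param_fn=identity)
    print(round(loss, 4))
```

Swapping `param_fn=identity` for `param_fn=softmax` leaves the parameter update additive but changes how the weights, and hence the predictions, respond to it.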