We study on-line generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. At the output layer we allow transfer functions, such as the softmax function, that depend on the linear activations of all the output neurons. The weight vectors used to produce the linear activations are represented indirectly: we maintain separate parameter vectors and obtain each weight vector by applying a particular parameterization function to the corresponding parameter vector. Upon seeing new examples, the parameter vectors are updated additively, as in the usual gradient descent update. However, by using a nonlinear parameterization function between the parameter vectors and the weight vectors, the resulting update of the weight vectors can be made quite different from a true gradient descent update. To analyze such updates, we define a notion of a matching loss function and apply it both to the transfer function and to the parameterization function. The loss function that matches the transfer function is used to measure the goodness of the algorithm's predictions. The loss function that matches the parameterization function serves both as a measure of divergence between models, motivating the update rule of the algorithm, and as a measure of progress in analyzing the algorithm's performance relative to an arbitrary fixed model. As a result, we obtain a unified treatment that generalizes earlier results for the gradient descent and exponentiated gradient algorithms to multidimensional outputs, including multiclass logistic regression.
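As a schematic illustration only (the notation below is not fixed by the text above: $F$ denotes a convex potential whose gradient $\phi = \nabla F$ is the transfer function, $\psi$ the parameterization function, $\theta_t$ the parameters, $W_t = \psi(\theta_t)$ the resulting weights, $x_t$ an instance, $y_t$ the desired output, and $\eta > 0$ a learning rate), the matching loss of a transfer function can be written as a Bregman divergence between activation vectors, and the additive update subtracts from the parameters $\eta$ times the gradient of the loss taken with respect to the weights:
\[
  L_\phi\bigl(\phi(a),\phi(\hat a)\bigr) \;=\; F(\hat a) - F(a) - (\hat a - a)^{\top}\phi(a),
\]
\[
  \hat y_t = \phi(W_t x_t), \quad W_t = \psi(\theta_t), \qquad
  \theta_{t+1} \;=\; \theta_t - \eta\,(\hat y_t - y_t)\,x_t^{\top},
\]
where the last expression uses the fact that the gradient of the matching loss with respect to the activations is simply $\hat y_t - y_t$. In this sketch, taking $\psi$ to be the identity recovers ordinary gradient descent, while a componentwise exponential $\psi$ yields exponentiated-gradient-style multiplicative updates on the weights; for the softmax transfer function, the matching loss is the relative entropy between desired and predicted outputs.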