K. Yamanishi, A Decision-Theoretic Extension of Stochastic Complexity and Its Applications to Learning, IEEE Transactions on Information Theory, 44(4), 1998, pp. 1424-1439
Citations number
29
Subject Categories
"Computer Science, Information Systems","Engineering, Electrical & Electronic"
Rissanen introduced stochastic complexity to define the amount of information in a given data sequence relative to a given hypothesis class of probability densities, where the information is measured in terms of the logarithmic loss associated with universal data compression. This paper introduces the notion of extended stochastic complexity (ESC) and demonstrates its effectiveness in the design and analysis of learning algorithms in on-line prediction and batch-learning scenarios. ESC can be thought of as an extension of Rissanen's stochastic complexity to the decision-theoretic setting, where a general real-valued function is used as a hypothesis and a general loss function is used as a distortion measure. As an application of ESC to on-line prediction, this paper shows that a sequential realization of ESC produces an on-line prediction algorithm called Vovk's aggregating strategy, which can be thought of as an extension of the Bayes algorithm. We derive upper bounds on the cumulative loss for the aggregating strategy, both in expectation and in the worst case, for the case where the hypothesis class is continuous. As an application of ESC to batch learning, this paper shows that a batch approximation of ESC induces a batch-learning algorithm called the minimum L-complexity algorithm (MLC), which is an extension of the minimum description length (MDL) principle. We derive upper bounds on the statistical risk of MLC, which are the tightest obtained to date. Through ESC we give a unifying view of the most effective learning algorithms recently explored in computational learning theory.
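The on-line prediction setting described above can be illustrated with a minimal exponential-weights sketch. This is a simplified variant in the spirit of the aggregating strategy, not Vovk's full substitution-function construction from the paper: it uses a plain weighted-average prediction, squared loss, and an illustrative learning rate `eta`; all names and the toy expert/outcome data are assumptions for the example.

```python
import math

def exp_weights_predict(expert_preds, weights):
    """Weighted-average prediction from the current expert weights."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, expert_preds)) / total

def update_weights(weights, expert_preds, outcome, eta=2.0):
    """Multiplicative update: each expert's weight decays by exp(-eta * its loss)."""
    return [w * math.exp(-eta * (p - outcome) ** 2)
            for w, p in zip(weights, expert_preds)]

# Two constant "experts" predicting 0.0 and 1.0; the outcomes favour the second.
experts = [lambda t: 0.0, lambda t: 1.0]
weights = [1.0, 1.0]
outcomes = [1.0, 1.0, 0.9, 1.0]
cum_loss = 0.0
for t, y in enumerate(outcomes):
    preds = [e(t) for e in experts]
    yhat = exp_weights_predict(preds, weights)
    cum_loss += (yhat - y) ** 2
    weights = update_weights(weights, preds, y)
```

After a few rounds the weight mass concentrates on the better expert, so the forecaster's cumulative loss stays close to that of the best expert in hindsight, which is the kind of cumulative-loss guarantee the paper's bounds formalize.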
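The batch-learning side can likewise be sketched as penalized model selection in the MDL spirit that MLC extends: choose the hypothesis minimizing a data-fit term plus a parameter-cost term. The concrete score below (a BIC-style (k/2) log n penalty with squared loss) and the constant-vs-linear candidates are illustrative assumptions, not the paper's exact L-complexity criterion.

```python
import math

def fit_constant(xs, ys):
    """One-parameter hypothesis: the sample mean."""
    c = sum(ys) / len(ys)
    return (lambda x: c), 1  # (predictor, number of parameters)

def fit_linear(xs, ys):
    """Two-parameter hypothesis: least-squares line a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return (lambda x: a + b * x), 2

def mdl_score(predict, k, xs, ys):
    """Two-part code length: data-fit term plus (k/2) log n parameter cost."""
    n = len(xs)
    rss = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))
    return 0.5 * n * math.log(rss / n) + 0.5 * k * math.log(n)

xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.1 * math.sin(x) for x in xs]  # nearly linear data
candidates = [fit_constant(xs, ys), fit_linear(xs, ys)]
best = min(candidates, key=lambda m: mdl_score(m[0], m[1], xs, ys))
```

On this nearly linear data the linear hypothesis wins despite its extra parameter, because its much smaller residual error outweighs the larger complexity penalty; the risk bounds in the paper quantify this trade-off for general loss functions.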