A DECISION-THEORETIC EXTENSION OF STOCHASTIC COMPLEXITY AND ITS APPLICATIONS TO LEARNING

Authors
K. Yamanishi
Citation
K. Yamanishi, A DECISION-THEORETIC EXTENSION OF STOCHASTIC COMPLEXITY AND ITS APPLICATIONS TO LEARNING, IEEE Transactions on Information Theory, 44(4), 1998, pp. 1424-1439
Citations number
29
Subject Categories
Computer Science, Information Systems; Engineering, Electrical & Electronic
ISSN journal
0018-9448
Volume
44
Issue
4
Year of publication
1998
Pages
1424 - 1439
Database
ISI
SICI code
0018-9448(1998)44:4<1424:ADEOSC>2.0.ZU;2-8
Abstract
Rissanen has introduced stochastic complexity to define the amount of information in a given data sequence relative to a given hypothesis class of probability densities, where the information is measured in terms of the logarithmic loss associated with universal data compression. This paper introduces the notion of extended stochastic complexity (ESC) and demonstrates its effectiveness in the design and analysis of learning algorithms in on-line prediction and batch-learning scenarios. ESC can be thought of as an extension of Rissanen's stochastic complexity to the decision-theoretic setting, where a general real-valued function is used as a hypothesis and a general loss function is used as a distortion measure. As an application of ESC to on-line prediction, this paper shows that a sequential realization of ESC produces an on-line prediction algorithm called Vovk's aggregating strategy, which can be thought of as an extension of the Bayes algorithm. We derive upper bounds on the cumulative loss for the aggregating strategy, both in expected form and in worst-case form, for the case where the hypothesis class is continuous. As an application of ESC to batch-learning, this paper shows that a batch-approximation of ESC induces a batch-learning algorithm called the minimum L-complexity algorithm (MLC), which is an extension of the minimum description length (MDL) principle. We derive upper bounds on the statistical risk for MLC, which are the smallest obtained to date. Through ESC we give a unifying view of the most effective learning algorithms that have recently been explored in computational learning theory.
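
To make the abstract's central quantities concrete, here is a standard rendering of Rissanen's stochastic complexity of a sequence $x^n$ relative to a density class $\{p_\theta\}$ with prior $\pi$, together with a plausible form of its decision-theoretic extension for a class $\mathcal{F}$ of real-valued hypotheses, loss $L$, and scale $\lambda > 0$; the mixture prior and $\lambda$ are notational assumptions, not quoted from the paper:

    SC(x^n) = -\log \int p_\theta(x^n) \, \pi(d\theta)

    I_\lambda(y^n) = -\frac{1}{\lambda} \log \int_{\mathcal{F}} \exp\!\Big( -\lambda \sum_{t=1}^{n} L(y_t, f(x_t)) \Big) \, \pi(df)

With the logarithmic loss $L(y_t, f(x_t)) = -\log p(y_t \mid x_t)$ and $\lambda = 1$, the second expression reduces to the first, which is the sense in which ESC extends stochastic complexity to general losses.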
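
The aggregating strategy described above reweights hypotheses multiplicatively by their exponentiated losses, much as a Bayes posterior does under log loss. The following is a minimal sketch for a finite expert class under squared loss; the learning rate eta, the weighted-mean prediction rule, and the toy data are illustrative assumptions (Vovk's general algorithm uses a loss-specific substitution function and integrates over continuous hypothesis classes):

    import math

    def aggregate_predict(experts, xs, ys, eta=0.5):
        # Start from uniform weights over the hypothesis class.
        weights = [1.0 / len(experts)] * len(experts)
        total_loss = 0.0
        for x, y in zip(xs, ys):
            preds = [f(x) for f in experts]
            # Predict with the weighted mean: a valid substitution rule for
            # squared loss with a suitable eta, not the general construction.
            y_hat = sum(w * p for w, p in zip(weights, preds))
            total_loss += (y - y_hat) ** 2
            # Multiplicative update: exponentially penalize instantaneous loss.
            weights = [w * math.exp(-eta * (y - p) ** 2)
                       for w, p in zip(weights, preds)]
            z = sum(weights)
            weights = [w / z for w in weights]
        return total_loss

    # Toy usage: two constant experts tracking a noisy constant target.
    experts = [lambda x: 0.0, lambda x: 1.0]
    ys = [0.9, 1.1, 1.0, 0.8, 1.2, 1.0, 0.9, 1.1, 1.0, 1.0]
    print(aggregate_predict(experts, range(len(ys)), ys))

For mixable losses such as bounded squared loss, the cumulative loss of this kind of forecaster exceeds that of the best of N experts by at most an O(log N) term, the finite-class analogue of the continuous-class bounds the abstract mentions.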
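
Correspondingly, the batch counterpart MLC can be read as a penalized empirical-loss criterion; the description-length notation $\ell(f)$ below is an assumption chosen to parallel MDL, not the paper's exact notation:

    \hat{f} = \arg\min_{f} \Big[ \sum_{i=1}^{n} L(y_i, f(x_i)) + \frac{1}{\lambda}\, \ell(f) \Big]

With logarithmic loss and $\lambda = 1$ this is the familiar two-part MDL criterion (data codelength plus model codelength), which is the sense in which MLC extends the MDL principle.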