Variable length Markov chains

Citation
P. Buhlmann and A.J. Wyner, Variable length Markov chains, ANN STATIST, 27(2), 1999, pp. 480-513
Number of citations
25
Subject categories
Mathematics
Journal title
ANNALS OF STATISTICS
Journal ISSN
0090-5364
Volume
27
Issue
2
Year of publication
1999
Pages
480 - 513
Database
ISI
SICI code
0090-5364(199904)27:2<480:VLMC>2.0.ZU;2-4
Abstract
We study estimation in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of high order, but with memory of variable length yielding a much bigger and structurally richer class of models than ordinary high-order Markov chains. From an algorithmic view, the VLMC model class has attracted interest in information theory and machine learning, but statistical properties have not yet been explored. Provided that good estimation is available, the additional structural richness of the model class enhances predictive power by finding a better trade-off between model bias and variance and allowing better structural description which can be of specific interest. The latter is exemplified with some DNA data. A version of the tree-structured context algorithm, proposed by Rissanen in an information theoretical set-up is shown to have new good asymptotic properties for estimation in the class of VLMCs. This remains true even when the underlying model increases in dimensionality. Furthermore, consistent estimation of minimal state spaces and mixing properties of fitted models are given. We also propose a new bootstrap scheme based on fitted VLMCs. We show its validity for quite general stationary categorical time series and for a broad range of statistical procedures.
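Illustration
The phrase "memory of variable length" in the abstract can be made concrete with a small sketch. The Python below is not taken from the paper. It assumes a hypothetical toy model on the alphabet {0, 1} with invented contexts and probabilities, and only shows how the relevant part of the past (the context) can be shorter for some histories than for others. The names TRANSITIONS, context_of and next_symbol_probs are illustrative.

```python
# Hypothetical toy VLMC on the alphabet {0, 1}. The contexts and the
# probabilities are invented for illustration; they do not come from the paper.
# Keys are contexts written in time order (oldest symbol first), so the key
# "10" means "the last symbol was 0 and the one before it was 1".
TRANSITIONS = {
    "1": {"0": 0.7, "1": 0.3},    # after a 1, one symbol of memory is enough
    "00": {"0": 0.2, "1": 0.8},   # after a 0, one more symbol is needed
    "10": {"0": 0.5, "1": 0.5},
}


def context_of(history: str) -> str:
    """Return the suffix of `history` that is a context of the toy model."""
    max_len = max(len(c) for c in TRANSITIONS)
    for k in range(1, max_len + 1):
        suffix = history[-k:]
        # The length check rejects histories shorter than k symbols.
        if len(suffix) == k and suffix in TRANSITIONS:
            return suffix
    raise ValueError("history too short to determine a context")


def next_symbol_probs(history: str) -> dict:
    """Next-symbol distribution, which depends on the history only via its context."""
    return TRANSITIONS[context_of(history)]
```

In this toy model, next_symbol_probs("0101") uses only the last symbol, while next_symbol_probs("0100") uses the last two. A full second-order chain on the same alphabet would carry four states (00, 01, 10, 11), whereas this sketch needs three.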