DIFFUSION OF CONTEXT AND CREDIT INFORMATION IN MARKOVIAN MODELS

Citation
Y. Bengio and P. Frasconi, DIFFUSION OF CONTEXT AND CREDIT INFORMATION IN MARKOVIAN MODELS, The Journal of Artificial Intelligence Research, 3, 1995, pp. 249-270
Number of citations
26
Subject Categories
Control Theory & Cybernetics; Computer Science, Artificial Intelligence
Journal ISSN
1076-9757
Volume
3
Year of publication
1995
Pages
249 - 270
Database
ISI
SICI code
1076-9757(1995)3:<249:DOCACI>2.0.ZU;2-0
Abstract
This paper studies the problem of ergodicity of transition probability matrices in Markovian models, such as hidden Markov models (HMMs), and how it makes very difficult the task of learning to represent long-term context for sequential data. This phenomenon hurts the forward propagation of long-term context information, as well as learning a hidden state representation to represent long-term context, which depends on propagating credit information backwards in time. Using results from Markov chain theory, we show that this problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., the transition probability matrices are sparse and the model essentially deterministic. The results found in this paper apply to learning approaches based on continuous optimization, such as gradient descent and the Baum-Welch algorithm.
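
The diffusion effect described in the abstract can be illustrated numerically. The following is a minimal sketch and is not taken from the paper: the two 2x2 transition matrices, the helper names (matmul, matrix_power, row_spread), and the "row spread" measure are all assumptions chosen for illustration. It raises each matrix to the power t and measures how much the rows of the result still differ. When the rows become identical, the state distribution after t steps no longer depends on the starting state, so the context has diffused away.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_power(a, t):
    """Return a**t (t >= 0) by repeated multiplication."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = matmul(result, a)
    return result


def row_spread(a):
    """Largest gap between two entries of the same column.

    A value near 0 means all rows are (almost) identical, i.e. the
    t-step distribution has forgotten the initial state.
    """
    n = len(a)
    return max(
        max(a[i][j] for i in range(n)) - min(a[i][j] for i in range(n))
        for j in range(n)
    )


# Hypothetical example matrices (each row sums to 1).
diffuse = [[0.6, 0.4],
           [0.3, 0.7]]             # probabilities far from 0 and 1
near_deterministic = [[0.99, 0.01],
                      [0.01, 0.99]]  # probabilities close to 0 or 1

for t in (1, 10, 50):
    print(t,
          row_spread(matrix_power(diffuse, t)),
          row_spread(matrix_power(near_deterministic, t)))
```

For these particular 2x2 matrices the spread shrinks geometrically with t, at a rate set by the second eigenvalue of the matrix (0.3 for the first matrix, 0.98 for the second). After 50 steps the first matrix has effectively forgotten its starting state while the near-deterministic one still retains a substantial dependence on it, consistent with the abstract's claim that diffusion is reduced when transition probabilities approach 0 or 1.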