Y. Bengio and P. Frasconi, DIFFUSION OF CONTEXT AND CREDIT INFORMATION IN MARKOVIAN MODELS, Journal of Artificial Intelligence Research, 3, 1995, pp. 249-270
Number of citations
26
Subject Categories
Control Theory & Cybernetics; Computer Science, Artificial Intelligence
This paper studies the problem of ergodicity of transition probability
matrices in Markovian models, such as hidden Markov models (HMMs), and
how it makes the task of learning to represent long-term context in
sequential data very difficult. This phenomenon hurts the forward
propagation of long-term context information, as well as the learning
of a hidden state representation for long-term context, which depends
on propagating credit information backwards in time. Using results from
Markov chain theory, we show that this problem of diffusion of context
and credit is reduced when the transition probabilities approach 0 or
1, i.e., when the transition probability matrices are sparse and the
model is essentially deterministic. The results found in this paper
apply to learning approaches based on continuous optimization, such as
gradient descent and the Baum-Welch algorithm.
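
The diffusion effect described in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: the three 2-state transition matrices and the helper names (`mat_mul`, `mat_power`, `row_spread`, `context_spread`) are assumptions chosen for illustration. The sketch raises each matrix to the power t and measures how much the rows of the result still differ. When the rows become equal, the state distribution at time t no longer depends on the starting state, so the initial context has been lost.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(a, t):
    """Return a raised to the t-th power (t >= 1) by repeated multiplication."""
    result = a
    for _ in range(t - 1):
        result = mat_mul(result, a)
    return result


def row_spread(a):
    """Largest gap between two entries of the same column.

    Zero means all rows are equal, i.e. the distribution over states
    is the same whatever the starting state was.
    """
    n = len(a)
    return max(
        max(a[i][j] for i in range(n)) - min(a[i][j] for i in range(n))
        for j in range(n)
    )


def context_spread(a, t):
    """How strongly the state distribution after t steps depends on the start."""
    return row_spread(mat_power(a, t))


# Hypothetical example matrices (each row sums to 1).
diffuse = [[0.6, 0.4], [0.3, 0.7]]                 # probabilities far from 0 and 1
near_deterministic = [[0.99, 0.01], [0.01, 0.99]]  # probabilities close to 0 or 1
deterministic = [[0.0, 1.0], [1.0, 0.0]]           # probabilities exactly 0 or 1

for name, matrix in [("diffuse", diffuse),
                     ("near-deterministic", near_deterministic),
                     ("deterministic", deterministic)]:
    spreads = [context_spread(matrix, t) for t in (1, 10, 50)]
    print(name, spreads)
```

Under these assumptions, the spread of the diffuse matrix collapses towards zero within a few steps, that of the near-deterministic matrix decays much more slowly, and that of the deterministic matrix does not decay at all. This matches the abstract's claim that diffusion is reduced as transition probabilities approach 0 or 1.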