Bayesian Belief Networks are a powerful tool for combining different
knowledge sources with various degrees of uncertainty in a
mathematically sound and computationally efficient way. Surprisingly,
they have not yet found their way into the speech processing field,
despite the fact
that in this domain multiple unreliable information sources exist.
The present paper shows how the theory can be utilized for language
modeling. After providing an introduction to the theory of Bayesian
Networks, we develop several extensions to the classic theory by
describing mechanisms for dealing with statistical dependence among
daughter nodes (usually assumed to be conditionally independent) and
by providing a learning algorithm, based on the EM algorithm, with
which the probabilities of link matrices can be learned from example
data. Using these
extensions, a language model for speech recognition based on a
context-free framework is constructed. In this model, sentences are
not parsed in their entirety, as is usual with grammatical
description, but only "locally" on suitably located segments. The
model was evaluated on a text database. In terms of test set entropy,
the model performed at least as well as the bigram and trigram
models, while showing a good ability to generalize from training to
test data.
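
For orientation, test set entropy is commonly defined as the average
negative log-probability per word that a model assigns to held-out
text. The form below is the standard textbook definition, not
necessarily the exact one used in the paper. The symbols N (number of
test words), w_i (the i-th test word) and h_i (the context the model
conditions on, e.g. one or two preceding words for bigram and trigram
models) are assumptions introduced here for illustration.

```latex
% Standard per-word test set entropy in bits (assumed notation):
%   N    number of words in the test set
%   w_i  the i-th word of the test set
%   h_i  the context used by the model to predict w_i
H = -\frac{1}{N} \sum_{i=1}^{N} \log_2 P(w_i \mid h_i)
% The corresponding perplexity is 2^{H}.
```

Under this definition, a lower value means the model predicts the test
text better, and a small gap between training set and test set entropy
indicates good generalization.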