Statistical language models estimate the distribution of various natural
language phenomena for the purpose of speech recognition and other language
technologies. Since the first significant model was proposed in 1980, many
attempts have been made to improve the state of the art. We review them
here, point to a few promising directions, and argue for a Bayesian approach
to integration of linguistic theories with data.