According to discourse theories in linguistics, conversational utterances possess an informational structure: each sentence consists of two components, the given and the new. The given refers to information that has previously been conveyed in the conversation, such as the word that in "That's interesting." The new component introduces information that is new to the conversation, such as the word interesting in the same example. In this work, we exploit this inherent structure for automatic conversational speech recognition by building sub-sentence discourse language models (LMs) to represent the bi-modal nature of each conversational sentence. The internal sentence structure is captured with a statistical sentence model regardless of whether the input sentences are linguistically or acoustically segmented. The proposed model is verified on the Switchboard corpus and contributes to a reduction in both LM perplexity and word recognition error rate. (C) 2000 Elsevier Science B.V. All rights reserved.
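The given/new decomposition described above can be made concrete with a toy sketch. The following is not the paper's actual model; it only illustrates the general idea of scoring a sentence with two sub-sentence LMs, one trained on "given" segments and one on "new" segments, under an assumed split point and simple add-one-smoothed unigram statistics.

```python
# Illustrative sketch only: two unigram sub-models, one per discourse
# component, combined to score a sentence split into given + new parts.
# Training data, vocabulary size, and split point are all assumptions.
from collections import Counter

VOCAB_SIZE = 10_000  # assumed vocabulary size for add-one smoothing

def train_unigram(sentences):
    """Return a smoothed unigram probability function from tokenized sentences."""
    counts = Counter(w for s in sentences for w in s)
    total = sum(counts.values())
    return lambda w: (counts[w] + 1) / (total + VOCAB_SIZE)

# Toy corpora: "given" halves vs. "new" halves of segmented sentences.
given_model = train_unigram([["that", "is"], ["i", "think"]])
new_model = train_unigram([["interesting"], ["so", "too"]])

def score(sentence, split):
    """Probability of a sentence whose first `split` words are 'given'."""
    p = 1.0
    for i, w in enumerate(sentence):
        p *= given_model(w) if i < split else new_model(w)
    return p

print(score(["that", "is", "interesting"], split=2))
```

A real system would use higher-order n-grams and infer the given/new boundary statistically rather than fixing it by hand, but the sketch shows how each component gets its own distribution.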