In spoken document retrieval (SDR), speech recognition is applied to a
collection to obtain either words or subword units, such as phonemes, that can
be matched against queries. We have explored retrieval based on phoneme
n-grams. The use of phonemes addresses the out-of-vocabulary (OOV) problem,
while the use of n-grams allows approximate matching on inaccurate phoneme
transcriptions. Our experiments explored the utility of word boundary
information, stopword elimination, query expansion, varying the length of
phoneme sequences to be matched, and various combinations of n-grams of
different lengths. Given word-based recognition (WBR), we can match queries
to speech using
a phoneme representation of the words, permitting us to test whether it is
the recognition or the matching process that is more crucial to retrieval
performance. Our experiments show that there is some deterioration in
effectiveness, but the particular form of matching is less vital if the
sequence of phonemes is correct. When phone sequences are recognised
directly, with higher error rates than for words, it is more important to
select a good matching approach. Varying n-gram length trades precision
against recall;
combining n-grams of different lengths, in particular 3-grams and 4-grams,
can improve retrieval. Overall, phoneme-based retrieval is not as effective
as word-based retrieval, but is sufficient for situations in which
word-based retrieval is either impractical or undesirable. © 2000 Elsevier
Science B.V. All rights reserved.
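
The idea of approximate matching on phoneme n-grams, including the combination of 3-grams and 4-grams, can be illustrated with a minimal sketch. It is not the matching or ranking function used in the experiments: the function names, the simple count-overlap score, and the ARPAbet-like phoneme symbols are all assumptions made for illustration.

```python
from collections import Counter


def phoneme_ngrams(phonemes, n):
    """Return all contiguous n-grams (as tuples) of a phoneme sequence."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def ngram_profile(phonemes, lengths=(3, 4)):
    """Count the n-grams of every requested length in one combined profile."""
    profile = Counter()
    for n in lengths:
        profile.update(phoneme_ngrams(phonemes, n))
    return profile


def overlap_score(query, document, lengths=(3, 4)):
    """Illustrative score: for each shared n-gram, add the smaller count."""
    q = ngram_profile(query, lengths)
    d = ngram_profile(document, lengths)
    # Counter intersection keeps, per n-gram, the minimum of the two counts.
    return sum((q & d).values())


# Hypothetical phoneme strings: the query is a clean rendering of
# "retrieval"; the document transcript contains one recognition error
# ("v" recognised as "b") plus surrounding context.
query = ["r", "ih", "t", "r", "iy", "v", "ax", "l"]
document = ["dh", "ax", "r", "ih", "t", "r", "iy", "b", "ax", "l", "s"]

score_3 = overlap_score(query, document, lengths=(3,))
score_4 = overlap_score(query, document, lengths=(4,))
score_combined = overlap_score(query, document, lengths=(3, 4))
```

In this toy case the single substituted phoneme removes only the n-grams that span it, so the query still matches the document partially. Shorter n-grams survive errors more often, while longer ones are more specific; this is the precision and recall trade-off that the abstract attributes to n-gram length.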