State-based Gaussian selection in large vocabulary continuous speech recognition using HMM's

Citation
M.J.F. Gales et al., State-based Gaussian selection in large vocabulary continuous speech recognition using HMM's, IEEE SPEECH, 7(2), 1999, pp. 152-161
Number of citations
17
Subject Categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
ISSN journal
10636676
Volume
7
Issue
2
Year of publication
1999
Pages
152 - 161
Database
ISI
SICI code
1063-6676(199903)7:2<152:SGSILV>2.0.ZU;2-E
Abstract
This paper investigates the use of Gaussian selection (GS) to increase the speed of a large vocabulary speech recognition system. Typically, 30-70% of the computational time of a continuous density hidden Markov model-based (HMM-based) speech recognizer is spent calculating probabilities. The aim of GS is to reduce this load by selecting the subset of Gaussian component likelihoods that should be computed given a particular input vector. This paper examines new techniques for obtaining "good" Gaussian subsets or "shortlists." All the new schemes make use of state information, specifically, to which state each of the Gaussian components belongs. In this way, a maximum number of Gaussian components per state may be specified, hence reducing the size of the shortlist. The first technique introduced is a simple extension of the standard GS method, which uses this state information. Then, more complex schemes based on maximizing the likelihood of the training data are proposed. These new approaches are compared with the standard GS scheme on a large vocabulary speech recognition task. On this task, the use of state information reduced the percentage of Gaussians computed to 10-15%, compared with 20-30% for the standard GS scheme, with little degradation in performance.
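Illustrative sketch
The following is a minimal Python sketch of the shortlist idea described in the abstract, not the authors' implementation. It assumes that the acoustic space has been vector quantized into codewords, that a Gaussian joins a codeword's shortlist when its mean lies within a squared Euclidean distance threshold of that codeword, and that at most `max_per_state` components are kept per state. All names (`build_shortlists`, `select`, `threshold`, `max_per_state`) are hypothetical, and the handling of Gaussians left outside the shortlist is not modelled.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_shortlists(codewords, means, states, threshold, max_per_state):
    """Return one shortlist of Gaussian indices per codeword.

    Assumption: a Gaussian is a candidate for a codeword when its mean is
    within `threshold` (squared distance) of the codeword. State information
    is then used to cap the candidates at `max_per_state` per state, keeping
    the closest ones.
    """
    shortlists = []
    for c in codewords:
        per_state = defaultdict(list)
        for g, (mu, s) in enumerate(zip(means, states)):
            d = sq_dist(c, mu)
            if d <= threshold:
                per_state[s].append((d, g))
        selected = []
        for cands in per_state.values():
            cands.sort()  # closest first, ties broken by Gaussian index
            selected.extend(g for _, g in cands[:max_per_state])
        shortlists.append(sorted(selected))
    return shortlists


def select(x, codewords, shortlists):
    """Quantize input vector x to its nearest codeword and return that shortlist."""
    k = min(range(len(codewords)), key=lambda i: sq_dist(x, codewords[i]))
    return shortlists[k]


# Toy data (made up for illustration): five 2-D Gaussian means in two states.
codewords = [(0.0, 0.0), (10.0, 10.0)]
means = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5), (9.0, 10.0), (10.0, 9.0)]
states = [0, 0, 0, 1, 1]
shortlists = build_shortlists(codewords, means, states,
                              threshold=4.0, max_per_state=2)
print(shortlists)
print(select((9.5, 9.5), codewords, shortlists))
```

In this toy setup, only the Gaussians returned by `select` would have their likelihoods computed for the input vector; lowering `max_per_state` shrinks each shortlist at the risk of excluding a component that matters for a given state.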