ITA
ENG

An approximate determinization algorithm for weighted finite-state automata

Authors

Buchsbaum, AL Giancarlo, R Westbrook, JR

Citation

Al. Buchsbaum et al., An approximate determinization algorithm for weighted finite-state automata, ALGORITHMIC, 30(4), 2001, pp. 503-526

Citations number

Categorie Soggetti

Engineering Mathematics

Journal title

ALGORITHMICA

ISSN journal

01784617 → ACNP

Volume

Issue

Year of publication

2001

Pages

503 - 526

Database

ISI

SICI code

0178-4617(200108)30:4<503:AADAFW>2.0.ZU;2-I

Abstract

Nondeterministic weighted finite-stare automata are a key abstraction in au tomatic speech recognition systems. The efficiency of automatic speech reco gnition depends directly on the sizes of these automata and the degree of n ondeterminism present, so recent research has studied ways to determinize a nd minimize them, using analogues of classical automata determinization and minimization. Although, as we describe here, determinization can in the wo rst case cause poly-exponential blowup in the number of stares of a weighte d finite-state automaton, in practice it is remarkably successful. In exten sive experiments in automatic speech recognition systems, deterministic wei ghted finite-state automata tend to be smaller than the corresponding nonde terministic inputs. Our observations show that these size reductions depend critically on the interplay between weights and topology in nondeterminist ic weighted finite-state automata. We exploit these observations to design a new approximate determinization algorithm, which produces a deterministic weighted finite-state automaton that preserves the strings of a weighted l anguage but not necessarily their weights. We apply our algorithm to two di fferent types of weighted finite-state automata that occur in automatic spe ech recognition systems and in each case provide extensive experimental res ults showing that, compared with current techniques, we achieve significant size reductions without affecting performance. In particular, for a standa rd test bed, we can reduce automatic speech recognition memory requirements by 25-35% with negligible effects on recognition time and accuracy.