On the determinization of weighted finite automata

Citation
Al. Buchsbaum et al., On the determinization of weighted finite automata, SIAM J COMP, 30(5), 2000, pp. 1502-1531
Citations number
31
Categorie Soggetti
Computer Science & Engineering
Journal title
SIAM JOURNAL ON COMPUTING
ISSN journal
00975397 → ACNP
Volume
30
Issue
5
Year of publication
2000
Pages
1502 - 1531
Database
ISI
SICI code
0097-5397(2000)30:5<1502:OTDOWF>2.0.ZU;2-B
Abstract
We study the problem of constructing the deterministic equivalent of a nond eterministic weighted finite-state automaton (WFA). Determinization of WFAs has important applications in automatic speech recognition (ASR). We provi de the first polynomial-time algorithm to test for the twins property, whic h determines if a WA admits a deterministic equivalent. We also give upper bounds on the size of the deterministic equivalent; the bound is tight in t he case of acyclic WFAs. Previously, Mohri presented a superpolynomial-time algorithm to test for the twins property, and he also gave an algorithm to determinize WFAs. He showed that the latter runs in time linear in the siz e of the output when a deterministic equivalent exists; otherwise, it does not terminate. Our bounds imply an upper bound on the running time of this algorithm. Given that WFAs can expand exponentially in size when determinized, we expl ore why those that occur in ASR tend to shrink when determinized. According to ASR folklore, this phenomenon is attributable solely to the fact that A SR WFAs have simple topology, in particular, that they are acyclic and laye red. We introduce a very simple class of WFAs with this structure, but we s how that the expansion under determinization depends on the transition weig hts: some weightings cause them to shrink, while others, including random w eightings, cause them to expand exponentially. We provide experimental evid ence that ASR WFAs exhibit this weight dependence. That they shrink when de terminized, therefore, is a result of favorable weightings in addition to s pecial topology. These analyses and observations have been used to design a new, approximate WFA determinization algorithm, reported in a separate pap er along with experimental results showing that it achieves significant WA size reduction with negligible impact on ASR performance.