Stochastic pronunciation modelling from hand-labelled phonetic corpora

Authors

Riley, M; Byrne, W; Finke, M; Khudanpur, S; Ljolje, A; McDonough, J; Nock, H; Saraclar, M; Wooters, C; Zavaliagkos, G

Citation

M. Riley et al., Stochastic pronunciation modelling from hand-labelled phonetic corpora, SPEECH COMM, 29(2-4), 1999, pp. 209-224

Number of citations

Subject categories

Computer Science & Engineering

Journal title

SPEECH COMMUNICATION

Journal ISSN

0167-6393

Volume

29

Issue

2-4

Year of publication

1999

Pages

209 - 224

Database

ISI

SICI code

0167-6393(199911)29:2-4<209:SPMFHP>2.0.ZU;2-W

Abstract

In the early 1990s, the availability of the TIMIT read-speech phonetically transcribed corpus led to work at AT&T on the automatic inference of pronunciation variation. This work, briefly summarized here, used stochastic decision trees trained on phonetic and linguistic features, and was applied to the DARPA North American Business News read-speech ASR task. More recently, the ICSI spontaneous-speech phonetically transcribed corpus was collected at the behest of the 1996 and 1997 LVCSR Summer Workshops held at Johns Hopkins University. A 1997 workshop (WS97) group focused on pronunciation inference from this corpus for application to the DoD Switchboard spontaneous telephone speech ASR task. We describe several approaches taken there. These include (1) one analogous to the AT&T approach, (2) one, inspired by work at WS96 and CMU, that involved adding pronunciation variants of a sequence of one or more words ('multiwords') in the corpus (with corpus-derived probabilities) into the ASR lexicon, and (1 + 2) a hybrid approach in which a decision-tree model was used to automatically phonetically transcribe a much larger speech corpus than ICSI and then the multiword approach was used to construct an ASR recognition pronunciation lexicon. (C) 1999 Elsevier Science B.V. All rights reserved.
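
As a rough illustration of what "corpus-derived probabilities" for pronunciation variants could look like, the sketch below estimates P(pronunciation | word) by relative frequency from word/phone-sequence pairs. This is only a minimal sketch under stated assumptions: the abstract does not say how the probabilities were computed, and the function name, the underscore-joined multiword key, the phone symbols, and the toy data are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def estimate_pronunciation_probs(aligned_pairs):
    """Relative-frequency estimate of P(pronunciation | word).

    aligned_pairs: iterable of (word, phone_sequence) tuples, where a
    "word" may also be a multiword unit such as "going_to".
    Assumption: plain relative frequency, with no smoothing or pruning.
    """
    counts = defaultdict(Counter)
    for word, phones in aligned_pairs:
        # Tuples are hashable, so each distinct variant gets its own count.
        counts[word][tuple(phones)] += 1

    lexicon = {}
    for word, variants in counts.items():
        total = sum(variants.values())
        lexicon[word] = {pron: n / total for pron, n in variants.items()}
    return lexicon


# Hypothetical toy data, invented for illustration only.
toy_corpus = [
    ("going_to", ["g", "ow", "ih", "ng", "t", "uw"]),
    ("going_to", ["g", "ah", "n", "ah"]),
    ("going_to", ["g", "ah", "n", "ah"]),
    ("and", ["ae", "n", "d"]),
    ("and", ["ae", "n"]),
    ("and", ["ae", "n"]),
    ("and", ["ax", "n"]),
]

lexicon = estimate_pronunciation_probs(toy_corpus)
```

Each entry of `lexicon` maps a word or multiword to its observed variants, with probabilities that sum to one per entry. A real system would presumably also need smoothing and a threshold on rare variants, which this sketch leaves out.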