Combination of machine scores for automatic grading of pronunciation quality

Citation
H. Franco et al., Combination of machine scores for automatic grading of pronunciation quality, SPEECH COMM, 30(2-3), 2000, pp. 121-130
Number of citations
17
Subject categories
Computer Science & Engineering
Journal title
SPEECH COMMUNICATION
Journal ISSN
0167-6393
Volume
30
Issue
2-3
Year of publication
2000
Pages
121 - 130
Database
ISI
SICI code
0167-6393(200002)30:2-3<121:COMSFA>2.0.ZU;2-I
Abstract
This work is part of an effort aimed at developing computer-based systems for language instruction; we address the task of grading the pronunciation quality of the speech of a student of a foreign language. The automatic grading system uses SRI's Decipher(TM) continuous speech recognition system to generate phonetic segmentations. Based on these segmentations and probabilistic models we produce different pronunciation scores for individual or groups of sentences that can be used as predictors of the pronunciation quality. Different types of these machine scores can be combined to obtain a better prediction of the overall pronunciation quality. In this paper we review some of the best-performing machine scores and discuss the application of several methods based on linear and nonlinear mapping and combination of individual machine scores to predict the pronunciation quality grade that a human expert would have given. We evaluate these methods in a database that consists of pronunciation-quality-graded speech from American students speaking French. With predictors based on spectral match and on durational characteristics, we find that the combination of scores improved the prediction of the human grades and that nonlinear mapping and combination methods performed better than linear ones. Characteristics of the different nonlinear methods studied are discussed. (C) 2000 Elsevier Science B.V. All rights reserved.
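Illustrative note
The linear combination of machine scores mentioned in the abstract can be sketched, in its simplest form, as an ordinary least-squares fit of two scores to human grades. This is only a minimal sketch under stated assumptions, not the authors' implementation: the score values, the grades, and the names `pearson` and `fit_linear_combination` are invented for illustration, and the nonlinear methods the paper reports as better are not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def fit_linear_combination(s1, s2, grades):
    """Least-squares (w1, w2, bias) such that w1*s1 + w2*s2 + bias approximates grades."""
    n = len(grades)
    m1, m2, mg = sum(s1) / n, sum(s2) / n, sum(grades) / n
    # Centered sums of squares and cross-products (normal equations).
    a11 = sum((a - m1) ** 2 for a in s1)
    a22 = sum((b - m2) ** 2 for b in s2)
    a12 = sum((a - m1) * (b - m2) for a, b in zip(s1, s2))
    b1 = sum((a - m1) * (g - mg) for a, g in zip(s1, grades))
    b2 = sum((b - m2) * (g - mg) for b, g in zip(s2, grades))
    det = a11 * a22 - a12 ** 2  # zero only if the two scores are collinear
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (b2 * a11 - b1 * a12) / det
    bias = mg - w1 * m1 - w2 * m2
    return w1, w2, bias

# Synthetic values, invented for illustration; NOT data from the paper.
spectral_scores = [0.2, 0.5, 0.4, 0.9, 0.7, 0.3, 0.8, 0.6]  # hypothetical spectral-match score
duration_scores = [0.6, 0.1, 0.5, 0.7, 0.9, 0.2, 0.4, 0.8]  # hypothetical duration score
human_grades = [2, 2, 3, 5, 5, 1, 4, 4]                      # hypothetical expert grades

w1, w2, bias = fit_linear_combination(spectral_scores, duration_scores, human_grades)
combined = [w1 * a + w2 * b + bias for a, b in zip(spectral_scores, duration_scores)]

r_spectral = pearson(spectral_scores, human_grades)
r_duration = pearson(duration_scores, human_grades)
r_combined = pearson(combined, human_grades)
print("spectral only:", r_spectral)
print("duration only:", r_duration)
print("linear combination:", r_combined)
```

On the data used for fitting, the combined score cannot correlate worse with the grades than either single score does; whether that gain holds on unseen speakers is an empirical question that needs held-out data.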