Xenophones: An investigation of phone set expansion in Swedish and implications for speech recognition and speech synthesis

Citation
R. Eklund and A. Lindstrom, Xenophones: An investigation of phone set expansion in Swedish and implications for speech recognition and speech synthesis, SPEECH COMM, 35(1-2), 2001, pp. 81-102
Number of citations
34
Subject categories
Computer Science & Engineering
Journal title
SPEECH COMMUNICATION
Journal ISSN
0167-6393
Volume
35
Issue
1-2
Year of publication
2001
Pages
81 - 102
Database
ISI
SICI code
0167-6393(200108)35:1-2<81:XAIOPS>2.0.ZU;2-9
Abstract
In recent years, both automatic speech recognition (ASR) and text-to-speech (TTS) conversion systems have attained quality levels that allow inclusion in everyday applications. One remaining problem to be solved in both these types of applications is that alleged phone inventories of specific languages are commonly expanded with phones from other languages, a problem that becomes more acute in an increasingly internationalized world where multilingual automatic speech-based services are a desideratum. This paper investigates the nature of phone set expansion in Swedish. The status of these phones is discussed, and since such added phones do not have a phonemic (or allophonic) function, the term 'xenophones' is suggested. The analysis is based on a production study involving 491 subjects, and the observed xenophonic expansion is described in terms of three categories along the "awareness" and the "fidelity" dimensions. The results show that very few subjects resort to full rephonematization and that xenophonic expansion is the rule, although there is an uneven distribution depending on particular phones, spanning from phones produced by most subjects, to phones produced by almost no subjects. Of the possible explanatory factors analyzed - regional background, gender, age and educational level - the latter is by far the most important. (C) 2001 Elsevier Science B.V. All rights reserved.