R. Eklund and A. Lindstrom, Xenophones: An investigation of phone set expansion in Swedish and implications for speech recognition and speech synthesis, SPEECH COMM, 35(1-2), 2001, pp. 81-102
In recent years, both automatic speech recognition (ASR) and text-to-speech
(TTS) conversion systems have attained quality levels that allow inclusion
in everyday applications. One remaining problem to be solved in both these
types of applications is that alleged phone inventories of specific
languages are commonly expanded with phones from other languages, a problem that
becomes more acute in an increasingly internationalized world where
multilingual automatic speech-based services are a desideratum. This paper
investigates the nature of phone set expansion in Swedish. The status of these
phones is discussed, and since such added phones do not have a phonemic (or
allophonic) function, the term 'xenophones' is suggested. The analysis is
based on a production study involving 491 subjects, and the observed
xenophonic expansion is described in terms of three categories along the "awareness"
and the "fidelity" dimensions. The results show that very few subjects
resort to full rephonematization and that xenophonic expansion is the rule,
although the distribution is uneven across phones, ranging from phones
produced by most subjects to phones produced by almost no
subjects. Of the possible explanatory factors analyzed - regional
background, gender, age and educational level - the last is by far the most
important. (C) 2001 Elsevier Science B.V. All rights reserved.