PRONUNCIATION BY ANALOGY - IMPACT OF IMPLEMENTATIONAL CHOICES ON PERFORMANCE

Citation
R.I. Damper and J.F.G. Eastmond, PRONUNCIATION BY ANALOGY - IMPACT OF IMPLEMENTATIONAL CHOICES ON PERFORMANCE, Language and Speech, 40, 1997, pp. 1-23
Number of citations
51
Subject Categories
Language & Linguistics
Journal title
Language and Speech
Journal ISSN
00238309
Volume
40
Year of publication
1997
Part
1
Pages
1 - 23
Database
ISI
SICI code
0023-8309(1997)40:<1:PBA-IO>2.0.ZU;2-S
Abstract
Pronunciation by analogy (PbA) is an emerging, data-driven technique with potential application in text-to-speech (TTS) systems, as well as being an influential psychological model of reading aloud. The underlying idea is that a pronunciation for an unknown word (i.e., one not in the dictionary, or lexicon, of the human or machine "reader") is assembled by matching substrings of the input to substrings of known, lexical words, hypothesizing a partial pronunciation for each matched substring from the lexical knowledge of the "reader," and concatenating the partial pronunciations. This paper assesses the capability of PbA to derive pronunciations for unknown words of English. As a psychological model, PbA is "under-specified," that is, the implementor of a simulation of the process faces detailed choices which can only be resolved by trial and error. One goal for this paper is to explore the impact of certain basic implementational choices on the performance of PbA systems. The variables studied are the specific lexical database used as the basis of the analogy process, the way of ranking/scoring candidate pronunciations, and the effect of manual versus automatic alignment of letters and phonemes. When tested with short (monosyllabic) pseudowords previously used in experimental psychology studies, the lowest error rate achieved is 14.3% (for a test set of size 70). We conclude that current PbA systems are at best poor models of pseudoword pronunciation by humans. To assess their suitability for use in a TTS application, in which multisyllabic words will be encountered, the implementations have also been tested with lexical words temporarily removed from the dictionary. The best performance obtained was 93.5% phonemes correct (corresponding to 67.9% words correct) for a 16,280-word dictionary. This is vastly superior to the 25.7% words correct obtained using a set of popular letter-to-sound rules, indicating considerable scope for analogy methods to be exploited in future TTS systems.
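
Illustrative sketch (not from the paper): the matching, hypothesizing and concatenating steps described in the abstract can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the authors' implementation. The toy lexicon, the one-phoneme-per-letter alignment with "-" for a silent letter, the function names, and the scoring rule (fewest matched segments, ties broken by how often a segment's pronunciation occurs in the lexicon) are all hypothetical choices made for the example. The paper itself treats the lexicon, the scoring method and the alignment as the variables under study.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: each spelling is paired with a phoneme list
# aligned one-to-one with its letters ("-" marks a silent letter).
LEXICON = {
    "mint": ["m", "I", "n", "t"],
    "hint": ["h", "I", "n", "t"],
    "pick": ["p", "I", "k", "-"],
    "pat": ["p", "a", "t"],
}


def substring_table(lexicon):
    """Map every letter substring to a Counter of its aligned phoneme tuples."""
    table = defaultdict(Counter)
    for word, phones in lexicon.items():
        for i in range(len(word)):
            for j in range(i + 1, len(word) + 1):
                table[word[i:j]][tuple(phones[i:j])] += 1
    return table


def pronounce(word, lexicon=LEXICON):
    """Assemble a pronunciation from matched substrings, or return None.

    Assumed scoring: prefer the fewest segments, then the largest total
    count of the chosen partial pronunciations.
    """
    table = substring_table(lexicon)
    # best[k] holds (segment count, negated support, phonemes) for word[:k].
    best = {0: (0, 0, ())}
    for end in range(1, len(word) + 1):
        candidates = []
        for start in range(end):
            piece = word[start:end]
            if start in best and piece in table:
                # Hypothesize the most frequent pronunciation of this piece.
                phones, count = table[piece].most_common(1)[0]
                segments, neg_support, prefix = best[start]
                candidates.append(
                    (segments + 1, neg_support - count, prefix + phones)
                )
        if candidates:
            best[end] = min(candidates)
    if len(word) not in best:
        return None  # some letter sequence has no match in the lexicon
    return [p for p in best[len(word)][2] if p != "-"]


if __name__ == "__main__":
    print(pronounce("pint"))
    print(pronounce("zap"))
```

With this toy lexicon, "pint" is assembled from "p" and "int" (by analogy with "mint" and "hint"), while "zap" yields no pronunciation because no lexical word contains "z". Real PbA systems typically allow matched substrings to overlap in a pronunciation lattice; the non-overlapping concatenation used here is a deliberate simplification.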