A comprehensive evaluation of five speech synthesis systems, covering
three different auditory evaluation methods, is presented. The purpose
of the tests is to assess the signal generation part (i.e. the
synthesizer itself) and the prosody generation module of two
rule-based and three template-based synthesis systems for German.
Different aspects of speech quality and distinct linguistic levels are
addressed by applying the following test procedures: the
Cluster-Identification Test for the assessment of segmental
intelligibility (TEST1), a Paired Comparison Test for the measurement
of general acceptance at the sentence level (TEST2), and a Category
Rating Test using eight attributes at the paragraph level (TEST3).
First, each test is presented separately. Then all test results are
related to each other, both to review the separate interpretations of
the respective results and to elicit the main dimensions of speech
quality. Finally, some conclusions regarding speech synthesis
improvement are drawn from these results. These conclusions are drawn
with the German Verbmobil project in view, this investigation being a
pilot task of the speech synthesis work group.