A new system for the automatic segmentation and labelling of speech is
presented. The system is capable of labelling speech originating from
different languages without requiring extensive linguistic knowledge
or large (manually segmented and labelled) training databases for those
languages. The system comprises small neural networks for the
segmentation and the broad phonetic classification of the speech. These networks
were originally trained on one task (Flemish continuous speech), and
are automatically adapted to a new task. Due to the limited size of
the neural networks, the segmentation and labelling strategy requires
only a limited amount of computation, and the adaptation to a new task
can be accomplished very quickly. The system was first evaluated on
five isolated-word corpora designed for the development of Dutch, French,
American English, Spanish and Korean text-to-speech systems. The
results show that the accuracy of the obtained automatic segmentation and
labelling is comparable to that of human experts. In order to provide
segmentation and labelling results which can be compared to data
reported in the literature, additional tests were run on TIMIT and on the
English, Danish and Italian portions of the EUROM0 continuous speech
utterances. The performance of our system appears to compare favourably
to that of other systems.