C.T. Hsieh et al., "Continuous Speech Segmentation Based on a Self-Learning Neuro-Fuzzy System," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E79-A(8), 1996, pp. 1180-1187
Citations number
16
Subject Categories
Engineering, Electrical & Electronic; Computer Science, Hardware & Architecture; Computer Science, Information Systems
Speech segmentation plays an important role in speech recognition because it reduces the memory requirement and the computational complexity of a large-vocabulary continuous speech recognition system. In this paper, we formulate speech segmentation as a two-phase problem. Phase 1 (frame labeling) labels each frame of speech data as one of three types, (1) silence, (2) consonant, or (3) vowel, according to two segmentation features. In phase 2 (syllabic unit segmentation), we apply the concept of transition states to segment the continuous speech data into syllabic units based on the labeled frames.
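As an illustration of the two-phase formulation, the following sketch (in Python) takes the phase-1 frame labels as given and groups them into syllabic units with a simple transition rule: a unit is assumed to start when speech resumes after silence or when a consonant follows a vowel. The label encoding and the transition rule are illustrative assumptions; the paper's actual transition-state definitions are not reproduced in this abstract.

# Minimal sketch of phase 2 (syllabic unit segmentation) under the
# assumptions stated above; not the paper's actual transition-state rules.
SILENCE, CONSONANT, VOWEL = "S", "C", "V"

def segment_syllables(frame_labels):
    """Group a sequence of phase-1 frame labels into syllabic units.
    Returns a list of (start_frame, end_frame) index pairs."""
    units = []
    start = None
    prev = SILENCE
    for i, label in enumerate(frame_labels):
        if label != SILENCE and (prev == SILENCE or
                                 (prev == VOWEL and label == CONSONANT)):
            if start is not None:
                units.append((start, i - 1))   # close the previous unit
            start = i                          # open a new syllabic unit
        elif label == SILENCE and start is not None:
            units.append((start, i - 1))       # silence ends the current unit
            start = None
        prev = label
    if start is not None:
        units.append((start, len(frame_labels) - 1))
    return units

# Example: S S C V V S C C V V  ->  two syllabic units, frames 2-4 and 6-9
print(segment_syllables(list("SSCVVSCCVV")))   # [(2, 4), (6, 9)]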
The novel class of hyperrectangular composite neural networks (HRCNNs) is used to cluster the frames. HRCNNs integrate the rule-based approach with the neural network paradigm, so this hybrid system can offset the disadvantages of each alternative. The parameters of the trained HRCNNs are used to extract both crisp and fuzzy classification rules.
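To make the rule-extraction idea concrete, the sketch below reads each trained hyperrectangle as one rule per frame class: the crisp rule fires only when both segmentation features fall inside the box, while the fuzzy rule assigns a membership value that decays with the distance from the box. The bound values, the choice of features, and the exponential membership form are assumptions made for illustration; they are not the paper's trained parameters.

import math

class HyperrectangleRule:
    def __init__(self, label, mins, maxs, gamma=4.0):
        self.label = label    # frame class: silence / consonant / vowel
        self.mins = mins      # lower bounds, one per segmentation feature
        self.maxs = maxs      # upper bounds, one per segmentation feature
        self.gamma = gamma    # steepness of the assumed fuzzy decay

    def crisp_match(self, x):
        # Crisp rule: fire only if every feature lies inside the hyperrectangle.
        return all(lo <= xi <= hi for xi, lo, hi in zip(x, self.mins, self.maxs))

    def fuzzy_membership(self, x):
        # Fuzzy rule: 1 inside the box, decaying with squared distance outside it.
        d2 = sum(max(lo - xi, 0.0, xi - hi) ** 2
                 for xi, lo, hi in zip(x, self.mins, self.maxs))
        return math.exp(-self.gamma * d2)

# Hypothetical rules over two segmentation features (e.g. normalized energy
# and zero-crossing rate); real bounds would come from the trained HRCNNs.
rules = [
    HyperrectangleRule("silence",   mins=[0.0, 0.0], maxs=[0.1, 0.3]),
    HyperrectangleRule("consonant", mins=[0.1, 0.4], maxs=[0.5, 1.0]),
    HyperrectangleRule("vowel",     mins=[0.4, 0.0], maxs=[1.0, 0.4]),
]

frame = [0.55, 0.2]  # a hypothetical feature vector for one frame
best = max(rules, key=lambda r: r.fuzzy_membership(frame))
print(best.label, round(best.fuzzy_membership(frame), 3))   # vowel 1.0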
In our experiments, a database containing continuous reading-rate Mandarin speech recorded from newscasts was used to evaluate the performance of the proposed speaker-independent speech segmentation system. The experimental results confirm the effectiveness of the proposed segmentation system.