Gd. Wu and Ct. Lin, A recurrent neural fuzzy network for word boundary detection in variable noise-level environments, IEEE SYST B, 31(1), 2001, pp. 84-97
Number of citations
27
Subject Categories
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS
This paper discusses the problem of automatic word boundary detection in the presence of variable-level background noise. Commonly used robust word boundary detection algorithms always assume that the background noise level is fixed. In fact, the background noise level may vary during recording. This is the major reason that most robust word boundary detection algorithms do not work well when the background noise level varies. To solve this problem, we first propose a refined time-frequency (RTF) parameter for extracting both the time and frequency features of noisy speech signals. The RTF parameter extends the time-frequency (TF) parameter proposed by Junqua et al. from single-band to multiband spectrum analysis, where the frequency bands help to make the distinction between speech signal and noise clear. The RTF parameter can extract useful frequency information. Based on this RTF parameter, we further propose a new word boundary detection algorithm using a recurrent self-organizing neural fuzzy inference network (RSONFIN). Since RSONFIN can process temporal relations, the proposed RTF-based RSONFIN algorithm can track the variation of the background noise level and detect correct word boundaries when the background noise level varies. Compared with ordinary neural networks, RSONFIN can always find an economical network size for itself with high learning speed. Owing to the self-learning ability of RSONFIN, this RTF-based RSONFIN algorithm avoids the need to determine ambiguous decision rules empirically, as ordinary word boundary detection algorithms require. Experimental results show that this new algorithm achieves a higher recognition rate than the TF-based algorithm, which has been shown to outperform several commonly used word boundary detection algorithms, by about 12% under variable background noise levels. It also reduces the recognition error rate due to endpoint detection to about 23%, compared to an average of 47% obtained by the TF-based algorithm in the same condition.
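
The abstract does not give the RTF formula, so the following is only an illustrative sketch of the general idea of pairing a per-frame time-domain energy with energies from several frequency bands. It is not the authors' RTF parameter. The function names, the frame length, the hop size, the number of bands, and the equal-width band split are all assumptions made for this example, and a naive DFT is used so the code needs only the Python standard library.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def band_energies(frame, n_bands):
    """Naive DFT power spectrum (lower half), summed into equal-width bands.

    Equal-width bands are an assumption of this sketch, not the paper's choice.
    """
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        power.append(abs(s) ** 2 / n)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]


def multiband_features(samples, frame_len=64, hop=32, n_bands=4):
    """Return one (log time energy, [log band energies]) pair per frame."""
    eps = 1e-12  # avoids log10(0) on silent frames
    feats = []
    for frame in frame_signal(samples, frame_len, hop):
        time_energy = sum(x * x for x in frame)
        bands = band_energies(frame, n_bands)
        feats.append((math.log10(time_energy + eps),
                      [math.log10(e + eps) for e in bands]))
    return feats
```

A sequence of such per-frame feature vectors is the kind of input a recurrent model could consume frame by frame. How the paper actually combines the bands and feeds them to RSONFIN is not stated in the abstract.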