We introduce acoustic sub-word units to neural networks for
speaker-independent continuous speech recognition. The functions of
segmenting input and detecting words are implemented with networks of
simple structures. The non-uniform unit which we introduce in this
research can model phoneme variations caused by co-articulation spread
over several phonemes and between words. These units can be segmented by
the network according to stationary and transition parts of speech without
iteration or without considering all possible position shifts. A word
lexicon can be trained by the network, which can effectively memorize all
transcription variations in the training utterances of words. The results
of speaker-independent word spotting of 520 words with TIMIT data are
described. (C) 2000 Elsevier Science Ltd. All rights reserved.