NEURAL NETWORKS AND THE TIME-SLICED PARADIGM FOR SPEECH RECOGNITION

Authors

KIRSCHNING I; AOE JI

Citation

I. Kirschning and Ji. Aoe, NEURAL NETWORKS AND THE TIME-SLICED PARADIGM FOR SPEECH RECOGNITION, IEICE transactions on information and systems, E79D(12), 1996, pp. 1690-1699

Citations number

Subject categories

Computer Science Information Systems

Journal title

IEICE transactions on information and systems

ISSN journal

0916-8532

Volume

E79D

Issue

12

Year of publication

1996

Pages

1690 - 1699

Database

ISI

SICI code

0916-8532(1996)E79D:12<1690:NNATTP>2.0.ZU;2-F

Abstract

The Time-Slicing paradigm is a newly developed method for the training of neural networks for speech recognition. The neural net is trained to spot the syllables in a continuous stream of speech. It generates a transcription of the utterance, be it a word, a phrase, etc. Combined with a simple error recovery method, the desired units (words or phrases) can be retrieved. This paradigm uses a recurrent neural network trained in a modular fashion with natural connectionist glue. It processes the input signal sequentially regardless of the input's length and immediately extracts the syllables spotted in the speech stream. As an example, this character string is then compared to a set of possible words, picking out the five closest candidates. In this paper we describe the time-slicing paradigm and the training of the recurrent neural network together with details about the training samples. It also introduces the concept of natural connectionist glue and the recurrent neural network's architecture used for this purpose. Additionally, we explain the errors found in the output and the process to reduce them and recover the correct words. The recognition rates of the network and the recovery rates for the words are also shown. The presented examples and recognition rates demonstrate the potential of the time-slicing method for continuous speech recognition.
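
The word-recovery step mentioned in the abstract (comparing the transcribed character string to a set of possible words and picking the five closest candidates) could be sketched roughly as follows. The abstract does not say which similarity measure the authors use, so this sketch assumes plain Levenshtein edit distance. The function names `edit_distance` and `closest_candidates` and the sample vocabulary are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings.

    Assumption: the record does not name the metric used in the paper;
    edit distance is used here only as a plausible stand-in.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_candidates(transcription: str,
                       vocabulary: List[str],
                       k: int = 5) -> List[Tuple[str, int]]:
    """Return the k vocabulary words closest to the transcription."""
    scored = [(word, edit_distance(transcription, word)) for word in vocabulary]
    # Sort by distance, breaking ties alphabetically for a stable result.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical vocabulary and a noisy syllable transcription.
    words = ["tokyo", "kyoto", "osaka", "kobe", "nagoya", "nara", "sapporo"]
    for word, dist in closest_candidates("kyouto", words):
        print(word, dist)
```

Returning a ranked shortlist rather than a single best match leaves room for a later error recovery stage to choose among the candidates, which is in keeping with the abstract's description of recovering the correct words from an imperfect transcription.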