AUTOMATIC SPEECH RECOGNITION TO AID THE HEARING-IMPAIRED - PROSPECTS FOR THE AUTOMATIC-GENERATION OF CUED SPEECH

Citation
R.M. Uchanski et al., AUTOMATIC SPEECH RECOGNITION TO AID THE HEARING-IMPAIRED - PROSPECTS FOR THE AUTOMATIC-GENERATION OF CUED SPEECH, Journal of Rehabilitation Research and Development, 31(1), 1994, pp. 20-41
Number of citations
46
Subject Categories
Rehabilitation
ISSN journal
0748-7711
Volume
31
Issue
1
Year of publication
1994
Pages
20 - 41
Database
ISI
SICI code
0748-7711(1994)31:1<20:ASRTAT>2.0.ZU;2-R
Abstract
Although great strides have been made in the development of automatic speech recognition (ASR) systems, the communication performance achievable with the output of current real-time speech recognition systems would be extremely poor relative to normal speech reception. An alternate application of ASR technology to aid the hearing impaired would derive cues from the acoustical speech signal that could be used to supplement speechreading. We report a study of highly trained receivers of Manual Cued Speech that indicates that nearly perfect reception of everyday connected speech materials can be achieved at near normal speaking rates. To understand the accuracy that might be achieved with automatically generated cues, we measured how well trained spectrogram readers and an automatic speech recognizer could assign cues for various cue systems. We then applied a recently developed model of audiovisual integration to these recognizer measurements and data on human recognition of consonant and vowel segments via speechreading to evaluate the benefit to speechreading provided by such cues. Our analysis suggests that with cues derived from current recognizers, consonant and vowel segments can be received with accuracies in excess of 80%. This level of performance is roughly equivalent to the segment reception accuracy required to account for observed levels of Manual Cued Speech reception. Current recognizers provide maximal benefit by generating only a relatively small number (three to five) of cue groups, and may not provide substantially greater aid to speechreading than simpler aids that do not incorporate discrete phonetic recognition. To provide guidance for the development of improved automatic cueing systems, we describe techniques for determining optimum cue groups for a given recognizer and speechreader, and estimate the cueing performance that might be achieved if the performance of current recognizers were improved.