CLASSIFICATION OF SPEECH UNDER STRESS USING TARGET DRIVEN FEATURES

Citation
B.D. Womack and J.H.L. Hansen, CLASSIFICATION OF SPEECH UNDER STRESS USING TARGET DRIVEN FEATURES, Speech Communication, 20(1-2), 1996, pp. 131-150
Number of citations
32
Subject categories
Communication, Language & Linguistics
Journal title
Speech Communication
Journal ISSN
0167-6393
Volume
20
Issue
1-2
Year of publication
1996
Pages
131 - 150
Database
ISI
SICI code
0167-6393(1996)20:1-2<131:COSUSU>2.0.ZU;2-#
Abstract
Speech production variations due to perceptually induced stress contribute significantly to reduced speech processing performance. One approach for assessment of production variations due to stress is to formulate an objective classification of speaker stress based upon the acoustic speech signal. This study proposes an algorithm for estimation of the probability of perceptually induced stress. It is suggested that the resulting stress score could be integrated into robust speech processing algorithms to improve robustness in adverse conditions. First, results from a previous stress classification study are employed to motivate selection of a targeted set of speech features on a per phoneme and stress group level. Analysis of articulatory, excitation and cepstral based features is conducted using a previously established stressed speech database (Speech Under Simulated and Actual Stress (SUSAS)). Stress sensitive targeted feature sets are then selected across ten stress conditions (including Apache helicopter cockpit, Angry, Clear, Lombard effect, Loud, etc.) and incorporated into a new targeted neural network stress classifier. Second, the targeted feature stress classification system is then evaluated and shown to achieve closed speaker, open token classification rates of 91.0%. Finally, the proposed stress classification algorithm is incorporated into a stress directed speech recognition system, where separate hidden Markov model recognizers are trained for each stress condition. An improvement of +10.1% and +15.4% over conventionally trained neutral and multi-style trained recognizers is demonstrated using the new stress directed recognition approach.
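Illustrative sketch
The stress directed recognition idea in the abstract (score the stress condition first, then hand the utterance to the recognizer trained for that condition) can be sketched roughly as follows. This is not the authors' implementation: the function names, the dictionary-of-scores interface, and the fallback to a "Neutral" recognizer are all assumptions made for illustration, and the stress scorer and per-condition recognizers are stand-ins for the neural network classifier and hidden Markov model recognizers the abstract mentions.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types: an utterance is a sequence of feature values,
# a scorer maps it to one score per stress condition, and a recognizer
# maps it to a recognized word.
Features = Sequence[float]
StressScorer = Callable[[Features], Dict[str, float]]
Recognizer = Callable[[Features], str]


def pick_stress_condition(scores: Dict[str, float]) -> str:
    """Return the stress condition with the highest score."""
    if not scores:
        raise ValueError("scores must not be empty")
    return max(scores, key=lambda condition: scores[condition])


def stress_directed_recognize(
    features: Features,
    stress_scorer: StressScorer,
    recognizers: Dict[str, Recognizer],
    fallback: str = "Neutral",  # assumed fallback, not stated in the abstract
) -> Tuple[str, str]:
    """Route an utterance to the recognizer for its estimated stress condition.

    Returns (estimated condition, recognized word). If no recognizer exists
    for the estimated condition, the fallback recognizer is used.
    """
    condition = pick_stress_condition(stress_scorer(features))
    recognizer = recognizers.get(condition, recognizers[fallback])
    return condition, recognizer(features)
```

In use, `recognizers` would hold one entry per stress condition (for example "Angry", "Loud", "Lombard"), each wrapping a model trained only on speech from that condition.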