FEATURE ANALYSIS AND NEURAL-NETWORK-BASED CLASSIFICATION OF SPEECH UNDER STRESS

Authors

Hansen JHL; Womack BD

Citation

J.H.L. Hansen and B.D. Womack, FEATURE ANALYSIS AND NEURAL-NETWORK-BASED CLASSIFICATION OF SPEECH UNDER STRESS, IEEE transactions on speech and audio processing, 4(4), 1996, pp. 307-313

Citations number

Subject Categories

Engineering, Electrical & Electronic; Acoustics

Journal title

IEEE transactions on speech and audio processing

ISSN journal

1063-6676

Volume

4

Issue

4

Year of publication

1996

Pages

307 - 313

Database

ISI

SICI code

1063-6676(1996)4:4<307:FAANCO>2.0.ZU;2-J

Abstract

It is well known that the variability in speech production due to task-induced stress contributes significantly to loss in speech processing algorithm performance. If an algorithm could be formulated that detects the presence of stress in speech, then such knowledge could be used to monitor speaker state, improve the naturalness of speech coding algorithms, or increase the robustness of speech recognizers. The goal in this study is to consider several speech features as potential stress-sensitive relayers using a previously established stressed speech database (SUSAS). The following speech parameters will be considered: mel, delta-mel, delta-delta-mel, auto-correlation-mel, and cross-correlation-mel cepstral parameters. Next, an algorithm for speaker-dependent stress classification is formulated for the 11 stress conditions: Angry, Clear, Cond50, Cond70, Fast, Lombard, Loud, Normal, Question, Slow, and Soft. It is suggested that additional feature variations beyond neutral conditions reflect the perturbation of vocal tract articulator movement under stressed conditions. Given a robust set of features, a neural network-based classifier is formulated based on an extended delta-bar-delta learning rule. Performance is considered for the following three test scenarios: monopartition (nontargeted) and tripartition (both nontargeted and targeted) input feature vectors.