ORTHOGONAL TRANSFORMATIONS OF STACKED FEATURE VECTORS APPLIED TO HMM SPEECH RECOGNITION

Citation
M.J. Flaherty and D.B. Roe, ORTHOGONAL TRANSFORMATIONS OF STACKED FEATURE VECTORS APPLIED TO HMM SPEECH RECOGNITION, IEE Proceedings. Part I. Communications, Speech and Vision, 140(2), 1993, pp. 121-126
Number of citations
8
Subject categories
Engineering, Electrical & Electronic
Journal ISSN
0956-3776
Volume
140
Issue
2
Year of publication
1993
Pages
121 - 126
Database
ISI
SICI code
0956-3776(1993)140:2<121:OTOSFV>2.0.ZU;2-3
Abstract
The paper reports improvements in speech recognition accuracy by using more sophisticated time analysis as part of the feature selection process. The recognition methodology utilises hidden Markov modelling with continuous density functions. The authors propose using, as speech features, linear transformations of the vector consisting of successive time samples of the cepstrum. Taylor series, the Legendre polynomial transform and the discrete cosine transform share several properties with principal components analysis. These transforms are expected to improve speech recognition accuracy by incorporating higher-order time derivatives (such as the second time derivative) of spectral information while at the same time producing an essentially diagonal covariance. In an experimental evaluation of these ideas, accuracy in speaker-independent recognition of the 'E'-set of the alphabet improved from 55%, with no time varying information, to 68%, with first-order time varying information, and 74%, by including second-order time varying information.
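
As an illustration only, the idea of transforming a stacked vector of successive cepstral frames could be sketched as follows, using the discrete cosine transform as the orthogonal transform. The abstract does not state the window length, the number of retained coefficients or the treatment of utterance boundaries, so the values and names below (`dct_basis`, `stacked_time_features`, `window=5`, `n_keep=3`, edge padding) are assumptions and not the authors' implementation.

```python
import numpy as np


def dct_basis(window, n_keep):
    """Return the first n_keep rows of the orthonormal DCT-II matrix.

    Each row is a basis vector over `window` successive time samples.
    Row 0 is constant (local average), row 1 is roughly a slope and
    row 2 is roughly a curvature, so they act like smoothed time
    derivatives of order 0, 1 and 2.
    """
    n = np.arange(window)
    k = np.arange(n_keep)[:, None]
    basis = np.cos(np.pi * (2 * n + 1) * k / (2 * window))
    basis *= np.sqrt(2.0 / window)
    basis[0] /= np.sqrt(2.0)  # rescale the constant row to unit norm
    return basis  # shape (n_keep, window)


def stacked_time_features(cepstra, window=5, n_keep=3):
    """Transform stacked cepstral frames along the time axis.

    cepstra: array of shape (n_frames, n_ceps), one cepstral vector per frame.
    window:  odd number of successive frames stacked around each frame
             (assumed value, not taken from the paper).
    n_keep:  number of DCT coefficients kept per cepstral coefficient.
    Returns an array of shape (n_frames, n_keep * n_ceps).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames, n_ceps = cepstra.shape
    half = window // 2
    # Repeat the first and last frames so every frame has a full window
    # (an assumed boundary treatment).
    padded = np.pad(cepstra, ((half, half), (0, 0)), mode="edge")
    # stacks[t] holds the `window` successive frames centred on frame t.
    stacks = np.stack([padded[t:t + window] for t in range(n_frames)])
    basis = dct_basis(window, n_keep)
    # Apply the transform over time, separately for each cepstral coefficient.
    feats = np.einsum("kw,twc->tkc", basis, stacks)
    return feats.reshape(n_frames, n_keep * n_ceps)
```

In this sketch, keeping one coefficient per cepstral dimension corresponds to using no time varying information, while keeping two or three adds first-order and second-order time variation respectively. The resulting feature vectors would then be modelled by the continuous-density HMM.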