ORTHOGONAL TRANSFORMATIONS OF STACKED FEATURE VECTORS APPLIED TO HMM SPEECH RECOGNITION

Authors

FLAHERTY MJ; ROE DB

Citation

M.J. Flaherty and D.B. Roe, ORTHOGONAL TRANSFORMATIONS OF STACKED FEATURE VECTORS APPLIED TO HMM SPEECH RECOGNITION, IEE proceedings. Part I. Communications, speech and vision, 140(2), 1993, pp. 121-126

Subject categories

Engineering, Electrical & Electronic

Journal title

IEE proceedings. Part I. Communications, speech and vision

ISSN journal

0956-3776

Volume

140

Issue

2

Year of publication

1993

Pages

121 - 126

Database

ISI

SICI code

0956-3776(1993)140:2<121:OTOSFV>2.0.ZU;2-3

Abstract

The paper reports improvements in speech recognition accuracy by using more sophisticated time analysis as part of the feature selection process. The recognition methodology utilises hidden Markov modelling with continuous density functions. The authors propose using, as speech features, linear transformations of the vector consisting of successive time samples of the cepstrum. Taylor series, the Legendre polynomial transform and the discrete cosine transform share several properties with principal components analysis. These transforms are expected to improve speech recognition accuracy by incorporating higher-order time derivatives (such as the second time derivative) of spectral information while at the same time producing an essentially diagonal covariance. In an experimental evaluation of these ideas, accuracy in speaker-independent recognition of the 'E'-set of the alphabet improved from 55%, with no time varying information, to 68%, with first-order time varying information, and 74%, by including second-order time varying information.
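
The stacking-and-transform idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window length (5 frames), the number of cepstral coefficients, the choice of an orthonormal DCT-II as the time-axis transform, and the function names `dct_basis` and `stacked_features` are all assumptions made for illustration. Each cepstral coefficient is collected over a short window of successive frames and projected onto the first few basis functions; order 0 tracks the local average, while orders 1 and 2 respond to the slope and curvature of the trajectory, playing the role of first- and second-order time information.

```python
import numpy as np


def dct_basis(window, order):
    """Rows 0..order of the orthonormal DCT-II basis over `window` time samples."""
    n = np.arange(window)
    k = np.arange(order + 1)[:, None]
    basis = np.cos(np.pi * (n + 0.5) * k / window)
    basis[0] *= np.sqrt(1.0 / window)   # scaling for the constant row
    basis[1:] *= np.sqrt(2.0 / window)  # scaling for the remaining rows
    return basis                        # shape: (order + 1, window)


def stacked_features(cepstra, window=5, order=2):
    """Transform sliding windows of cepstral frames along the time axis.

    cepstra: array of shape (T, D), one D-dimensional cepstral vector per frame.
    Returns an array of shape (T - window + 1, (order + 1) * D), laid out as
    [order-0 block | order-1 block | ... ], each block of width D.
    """
    T, D = cepstra.shape
    basis = dct_basis(window, order)
    # Frame indices of every sliding window: shape (T - window + 1, window).
    idx = np.arange(T - window + 1)[:, None] + np.arange(window)[None, :]
    stacked = cepstra[idx]              # shape: (N, window, D)
    # Project each coefficient's time trajectory onto the basis rows.
    feats = np.einsum("kw,nwd->nkd", basis, stacked)
    return feats.reshape(feats.shape[0], -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cepstra = rng.standard_normal((20, 12))  # hypothetical 20 frames, 12 coefficients
    feats = stacked_features(cepstra, window=5, order=2)
    print(feats.shape)
```

Setting `order=0` keeps only the windowed average, `order=1` adds slope-like information, and `order=2` adds curvature-like information, which loosely mirrors the three conditions compared in the abstract. The accuracy figures reported there come from the paper's own experiments and are not reproduced by this sketch.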