S.V. Vaseghi et al., Speech modeling using cepstral-time feature matrices in hidden Markov models, IEE Proceedings, Part I: Communications, Speech and Vision, 140(5), 1993, pp. 317-320
The paper explores the use of 2-dimensional cepstral-time features to
exploit the correlation among successive speech spectral vectors
within a hidden Markov model (HMM) framework. A cepstral-time feature
matrix is obtained from a 2-dimensional discrete cosine transform of a
spectral-time matrix. The advantages of cepstral-time features are
that: cepstral-time feature matrices are a simple and robust method of
representing the short-time variation of speech spectral parameters; a
cepstral-time matrix contains information on the transitional dynamics
of the feature vectors within the matrix; speech recognition based on
cepstral-time matrices is more robust in noisy environments; and the
use of a matrix of M cepstral vectors implies a minimum HMM-state
duration constraint of M vector units. A simple framework investigated
in the paper for applications of cepstral-time features is a
finite-state matrix quantiser (FSMQ), a special case of the HMM, which
is used to initialise the training phase of HMMs.
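
The 2-dimensional discrete cosine transform described above can be
sketched as follows. This is a minimal illustration, not the paper's
implementation: the function names (dct_1d, cepstral_time_matrix), the
orthonormal DCT-II scaling, and the layout of the input (one row per
time frame, one column per log-spectral channel) are all assumptions
made for the example.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (scaling is an assumption)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def cepstral_time_matrix(spectral_time):
    """Apply a separable 2-D DCT to a spectral-time matrix.

    spectral_time: list of M frames, each a list of N log-spectral values
    (assumed layout). Returns an M x N matrix of the same shape.
    """
    # Step 1: DCT along the spectral axis turns each frame into a cepstral vector.
    cepstra = [dct_1d(frame) for frame in spectral_time]
    # Step 2: DCT along the time axis, applied to each cepstral coefficient track.
    tracks = [dct_1d(list(track)) for track in zip(*cepstra)]
    # Transpose back so the result is indexed [time_index][cepstral_index].
    return [list(row) for row in zip(*tracks)]


# Hypothetical input: M = 4 frames of N = 3 log-spectral channels.
frames = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 2.0],
    [2.0, 1.0, 0.5],
    [0.5, 0.0, 1.0],
]
features = cepstral_time_matrix(frames)
```

Under this scaling, row 0 of the result is proportional to the sum of
the M cepstral vectors, while the higher rows describe how each
cepstral coefficient varies across the M frames.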