In this paper we present an algorithm for estimating state-dependent
polynomial coefficients in the nonstationary-state hidden Markov model
(or trended HMM) that allows for linear time warping, or scaling, in
individual model states. The need for state-dependent time warping
arises because speaking rate variation and other temporal factors in
speech cause the multiple state-segmented speech data sequences used
to train a single set of polynomial coefficients to vary appreciably
in length.
The algorithm is developed within a general framework that uses
auxiliary parameters, which, though of no interest in themselves,
provide an intermediate tool for achieving maximal accuracy in
estimating the polynomial coefficients of the trended HMM. It is proved
that the proposed estimation algorithm converges to a solution
equivalent to the state-optimized maximum likelihood estimate. The
effectiveness of the algorithm is demonstrated in experiments designed
to fit a single trended HMM simultaneously to multiple sequences of
speech data that are different renditions of the same word yet vary
over a wide range in sequence length. Speech recognition experiments
have been performed on the standard acoustic-phonetic TIMIT database.
The results demonstrate the advantage of the time-warping trended HMMs
over the regular trended HMMs, with an improvement of about 10 to 15%
in recognition rate.
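
To make the motivation concrete, the following is a minimal sketch (not the estimation algorithm of this paper) of why linear time scaling helps when one polynomial trend is fitted to several segments of different lengths. It assumes a single state, scalar observations, and a warping factor fixed in advance from each segment's length, whereas the paper works with auxiliary parameters inside the estimation procedure. All names (`warped_times`, `fit_trend`, `make_demo_sequences`) and the demo coefficients are hypothetical.

```python
import numpy as np


def warped_times(length, ref_length):
    """Linearly scale frame indices 0..length-1 onto 0..ref_length-1."""
    if length < 2:
        return np.zeros(length)
    # Assumed per-segment scaling factor, fixed from the lengths alone.
    scale = (ref_length - 1) / (length - 1)
    return scale * np.arange(length)


def fit_trend(sequences, order, ref_length=None):
    """Pooled least-squares fit of one polynomial trend to all sequences.

    With ref_length=None the raw frame index is used (no warping).
    Returns the coefficients (lowest order first) and the squared error.
    """
    times, values = [], []
    for seq in sequences:
        seq = np.asarray(seq, dtype=float)
        if ref_length is None:
            t = np.arange(len(seq), dtype=float)
        else:
            t = warped_times(len(seq), ref_length)
        times.append(t)
        values.append(seq)
    t_all = np.concatenate(times)
    y_all = np.concatenate(values)
    # Design matrix with columns 1, t, t^2, ..., t^order.
    design = np.vander(t_all, order + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(design, y_all, rcond=None)
    sse = float(np.sum((design @ coeffs - y_all) ** 2))
    return coeffs, sse


def make_demo_sequences(true_coeffs=(1.0, 0.5, -0.02), ref_length=20,
                        lengths=(12, 20, 35)):
    """Sample the same synthetic trend at several speaking rates."""
    sequences = []
    for n in lengths:
        t = warped_times(n, ref_length)
        design = np.vander(t, len(true_coeffs), increasing=True)
        sequences.append(design @ np.asarray(true_coeffs))
    return sequences


if __name__ == "__main__":
    data = make_demo_sequences()
    coeffs_warped, sse_warped = fit_trend(data, order=2, ref_length=20)
    coeffs_raw, sse_raw = fit_trend(data, order=2)
    print("warped fit:", coeffs_warped, "SSE:", sse_warped)
    print("raw fit:   ", coeffs_raw, "SSE:", sse_raw)
```

On this noise-free synthetic data the warped fit recovers the generating coefficients, while the fit on raw frame indices cannot match all three segments with a single polynomial.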