SPOKEN LANGUAGE IDENTIFICATION BY ERGODIC HMMS AND ITS STATE SEQUENCES

Authors

NAKAGAWA S, SEINO T, UEDA Y

Citation

S. Nakagawa et al., SPOKEN LANGUAGE IDENTIFICATION BY ERGODIC HMMS AND ITS STATE SEQUENCES, Electronics and communications in Japan. Part 3, Fundamental electronic science, 77(6), 1994, pp. 70-79

Subject Categories

Engineering, Electrical & Electronic

Journal title

Electronics and communications in Japan. Part 3, Fundamental electronic science

ISSN journal

1042-0967

Volume

77

Issue

6

Year of publication

1994

Pages

70 - 79

Database

ISI

SICI code

1042-0967(1994)77:6<70:SLIBEH>2.0.ZU;2-8

Abstract

This paper describes an automatic text- and speaker-independent language identification method based on hidden Markov models (HMMs) for acoustic features. The HMMs are used to represent the phonotactics of each language, since each language has its own phonotactics. The HMM topology is fully connected (ergodic): any state can transition to any state. Two kinds of HMMs are used: the discrete HMM (DHMM) with a codebook and the continuous density HMM (CHMM). The HMMs were trained using both the Baum-Welch (forward-backward) algorithm and the Viterbi algorithm; the latter was used to emphasize the state transition probabilities. For comparison, identification experiments were also conducted using a single-state Gaussian mixture model. This single-state model gave the same performance as the HMM trained with the Baum-Welch algorithm, because the transitions between states, which reflect the characteristics of each language, do not affect the likelihood scores very much. This problem was addressed by emphasizing the transition probabilities and using the Viterbi algorithm, which improved the recognition rates. A trigram model over the optimal state sequence is also introduced; combining it with the HMM produced the best results.
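
As a rough illustration of the scoring idea in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a discrete ergodic HMM per language, scores a codebook-index sequence with the Viterbi algorithm, and picks the language with the highest score. The `trans_weight` exponent is a hypothetical way to emphasize transition probabilities (the record does not say how the paper does this), and the two toy models, their names, and all probabilities are invented for the example.

```python
import math

def log_matrix(rows):
    """Convert a matrix of probabilities to log probabilities."""
    return [[math.log(p) for p in row] for row in rows]

def viterbi_log_score(obs, log_pi, log_A, log_B, trans_weight=1.0):
    """Return (best log score, best state path) for a discrete ergodic HMM.

    trans_weight scales the log transition probabilities; values above 1
    emphasize transitions. This weighting is an assumption for illustration.
    """
    n_states = len(log_pi)
    delta = [log_pi[s] + log_B[s][obs[0]] for s in range(n_states)]
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n_states):
            # Ergodic topology: every state i may precede every state j.
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + trans_weight * log_A[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i]
                             + trans_weight * log_A[best_i][j]
                             + log_B[j][o])
        delta = new_delta
        backptrs.append(ptr)
    # Backtrack from the best final state to recover the state sequence.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

def identify_language(obs, models, trans_weight=1.0):
    """Pick the model name whose Viterbi score for obs is highest."""
    scores = {name: viterbi_log_score(obs, *m, trans_weight=trans_weight)[0]
              for name, m in models.items()}
    return max(scores, key=scores.get)

# Toy models (hypothetical): identical emissions, different transitions,
# so only the state-transition structure distinguishes the "languages".
LOG_PI = [math.log(0.5), math.log(0.5)]
LOG_B = log_matrix([[0.9, 0.1], [0.1, 0.9]])
MODELS = {
    "sticky": (LOG_PI, log_matrix([[0.9, 0.1], [0.1, 0.9]]), LOG_B),
    "alternating": (LOG_PI, log_matrix([[0.1, 0.9], [0.9, 0.1]]), LOG_B),
}

best_for_runs = identify_language([0, 0, 0, 1, 1, 1], MODELS)
best_for_flips = identify_language([0, 1, 0, 1, 0, 1], MODELS)
```

The state path returned by `viterbi_log_score` is the kind of optimal state sequence over which a trigram model could then be estimated; that step is left out of this sketch.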