Segmentation of yeast DNA using hidden Markov models

Citation
L. Peshkin and M.S. Gelfand, Segmentation of yeast DNA using hidden Markov models, BIOINFORMAT, 15(12), 1999, pp. 980-986
Number of citations
18
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
15
Issue
12
Year of publication
1999
Pages
980 - 986
Database
ISI
SICI code
1367-4803(199912)15:12<980:SOYDUH>2.0.ZU;2-6
Abstract
Motivation: Compositionally homogeneous segments of genomic DNA often correspond to meaningful biological units. Simple sliding window analysis is usually insufficient for compositional segmentation of natural sequences. Hidden Markov models (HMMs) with a small number of states are a natural language for describing the compositional properties of chromosome-size DNA sequences. Results: The algorithms were applied to yeast Saccharomyces cerevisiae chromosomes (YC) I, III, IV, VI and IX. The optimal number of HMM states is found to be four. The optimal four-state HMMs for all chromosomes are very similar, as are the reconstructed segmentations. In most cases the model with k + 1 states is obtained by 'splitting' one of the states in the model with k states, with a corresponding increase in the level of detail of the segmentation. The high-AT states usually correspond to intergenic regions. We also explore the model's likelihood landscape and analyze the dynamics of the optimization process, thus addressing the reliability of the obtained optima and the efficiency of the algorithms.