MICROBIAL GENE IDENTIFICATION USING INTERPOLATED MARKOV-MODELS

Citation
S.L. Salzberg et al., Microbial gene identification using interpolated Markov models, Nucleic Acids Research, 26(2), 1998, pp. 544-548
Number of citations
14
Subject categories
Biology
Journal title
Nucleic Acids Research
Journal ISSN
03051048
Volume
26
Issue
2
Year of publication
1998
Pages
544 - 548
Database
ISI
SICI code
0305-1048(1998)26:2<544:MGIUIM>2.0.ZU;2-U
Abstract
This paper describes a new system, GLIMMER, for finding genes in microbial genomes. In a series of tests on Haemophilus influenzae, Helicobacter pylori and other complete microbial genomes, this system has proven to be very accurate at locating virtually all the genes in these sequences, outperforming previous methods. A conservative estimate based on experiments on H. pylori and H. influenzae is that the system finds >97% of all genes. GLIMMER uses interpolated Markov models (IMMs) as a framework for capturing dependencies between nearby nucleotides in a DNA sequence. An IMM-based method makes predictions based on a variable context, i.e., a variable-length oligomer in a DNA sequence. The context used by GLIMMER changes depending on the local composition of the sequence. As a result, GLIMMER is more flexible and more powerful than fixed-order Markov methods, which have previously been the primary content-based technique for finding genes in microbial DNA.
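
The variable-context idea in the abstract can be illustrated with a minimal Python sketch. This is not GLIMMER's implementation: the abstract does not say how the interpolation weights are computed, so the count-based weight below, the `threshold` parameter, and the function names `train_counts` and `imm_probability` are all assumptions made for illustration. The sketch only shows the general principle that longer contexts are trusted more when the training data contains enough examples of them, with shorter contexts as the fallback.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def train_counts(sequences, max_order):
    """Count how often each base follows each context of length 0..max_order."""
    counts = [defaultdict(lambda: defaultdict(int)) for _ in range(max_order + 1)]
    for seq in sequences:
        for i, base in enumerate(seq):
            # Use every context length that fits to the left of position i.
            for k in range(min(i, max_order) + 1):
                context = seq[i - k:i]
                counts[k][context][base] += 1
    return counts


def imm_probability(counts, context, base, threshold=20):
    """Interpolated estimate of P(base | context).

    Starts from a smoothed order-0 estimate and blends in each longer
    context. The blending weight grows with the number of times that
    context was observed (an illustrative assumption, not GLIMMER's rule).
    """
    order0 = counts[0][""]
    total0 = sum(order0.values())
    # Add-one smoothing keeps the order-0 estimate non-zero.
    prob = (order0[base] + 1) / (total0 + len(ALPHABET))

    max_order = min(len(counts) - 1, len(context))
    for k in range(1, max_order + 1):
        ctx = context[len(context) - k:]
        seen = counts[k].get(ctx)
        if not seen:
            break  # context never observed; keep the shorter-context estimate
        total = sum(seen.values())
        weight = min(1.0, total / threshold)
        prob = weight * (seen[base] / total) + (1.0 - weight) * prob
    return prob


# Usage: train on a toy sequence and score each base after the context "A".
toy_counts = train_counts(["ACGT" * 15], max_order=2)
for b in ALPHABET:
    print(b, imm_probability(toy_counts, "A", b))
```

A fixed-order Markov model would correspond to always using one value of k. In this sketch the effective context length instead varies from position to position, depending on which oligomers were seen often enough in training.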