Hidden Markov models that use predicted secondary structures for fold recognition

Authors

Hargbo, J Elofsson, A

Citation

J. Hargbo and A. Elofsson, Hidden Markov models that use predicted secondary structures for fold recognition, PROTEINS, 36(1), 1999, pp. 68-76

Subject categories

Biochemistry & Biophysics

Journal title

PROTEINS-STRUCTURE FUNCTION AND GENETICS

ISSN journal

0887-3585

Volume

36

Issue

1

Year of publication

1999

Pages

68 - 76

Database

ISI

SICI code

0887-3585(19990701)36:1<68:HMMTUP>2.0.ZU;2-O

Abstract

There are many proteins that share the same fold but have no clear sequence similarity. To predict the structure of these proteins, so-called "protein fold recognition methods" have been developed. During the last few years, improvements of protein fold recognition methods have been achieved through the use of predicted secondary structures (Rice and Eisenberg, J Mol Biol 1997;267:1026-1038), as well as by using multiple sequence alignments in the form of hidden Markov models (HMM) (Karplus et al., Proteins Suppl 1997;1:134-139). To test the performance of different fold recognition methods, we have developed a rigorous benchmark where representatives for all proteins of known structure are matched against each other. Using this benchmark, we have compared the performance of automatically created hidden Markov models with standard sequence search methods. Further, we combine the use of predicted secondary structures and multiple sequence alignments into a combined method that performs better than methods that do not use this combination of information. Using only single sequences, the correct fold of a protein was detected for 10% of the test cases in our benchmark. Including multiple sequence information increased this number to 16%, and when predicted secondary structure information was included as well, the fold was correctly identified in 20% of the cases. Moreover, if the correct secondary structure was used, 27% of the proteins could be correctly matched to a fold. For comparison, blast2, fasta, and ssearch identify the fold correctly in 13-17% of the cases. Thus, standard pairwise sequence search methods perform almost as well as hidden Markov models in our benchmark. This is probably because the automatically created multiple sequence alignments used in this study do not contain enough diversity and because the current generation of hidden Markov models does not perform very well when built from a few sequences. Proteins 1999;36:68-76.
© 1999 Wiley-Liss, Inc.