PROTEIN TOPOLOGY RECOGNITION FROM SECONDARY STRUCTURE SEQUENCES - APPLICATION OF THE HIDDEN MARKOV-MODELS TO THE ALPHA-CLASS PROTEINS

Citation
V. Difrancesco et al., PROTEIN TOPOLOGY RECOGNITION FROM SECONDARY STRUCTURE SEQUENCES - APPLICATION OF THE HIDDEN MARKOV-MODELS TO THE ALPHA-CLASS PROTEINS, Journal of Molecular Biology, 267(2), 1997, pp. 446-463
Number of citations
62
Subject categories
Biology
ISSN journal
00222836
Volume
267
Issue
2
Year of publication
1997
Pages
446 - 463
Database
ISI
SICI code
0022-2836(1997)267:2<446:PTRFSS>2.0.ZU;2-Q
Abstract
The three-dimensional fold of a protein is described by the organization of its secondary structure elements in 3D space, i.e. its "topology". We find that the protein topology can be recognized from the 1D sequence of secondary structure states of the residues alone. Automated recognition is facilitated by use of hidden Markov models (HMMs) to represent topology families of proteins. Such models can be trained on the experimentally observed secondary structure sequences of family members using well established algorithms. Here, we model various topology groups in the alpha class of proteins and identify, from a large database, those proteins having the topology described by each model. The correct topology family for protein secondary structure sequences could be recognized 12 out of 14 times. When the observed secondary structure sequences are replaced with predicted sequences recognition is still achievable 8 out of 14 times. The success rate for observed sequences indicates that our approach will become increasingly useful as the accuracy of secondary prediction algorithms is improved. Our study indicates that the HMMs are useful for protein topology recognition even when no detectable primary amino acid sequence similarity is present. To illustrate the potential utility of our method, protein topology recognition is attempted on leptin, the obese gene product, and the human interleukin-6 sequence, for which fold predictions have been previously published. (C) 1997 Academic Press Limited.
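Illustrative sketch (not part of the original record): the recognition step described in the abstract can be pictured as scoring one secondary structure sequence against several family HMMs and keeping the best-scoring family. The Python below is a minimal sketch under stated assumptions. The three-letter alphabet (H, E, C), the two-state model layout, the family names and every probability are hypothetical values chosen for illustration; they are not the models or parameters of the paper, and here they are written by hand rather than trained. Scoring uses the standard scaled forward algorithm.

```python
import math

# Assumed three-state secondary structure alphabet: helix, strand, coil.
SYMBOLS = "HEC"


def forward_log_likelihood(model, seq):
    """Return log P(seq | model) using the scaled forward algorithm."""
    if not seq:
        raise ValueError("sequence must not be empty")
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through the transition matrix, then emit seq[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def recognize(models, seq):
    """Score seq against every model; return (best family name, all scores)."""
    scores = {name: forward_log_likelihood(m, seq) for name, m in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy models: state 0 is the "element" state, state 1 the loop.
MODELS = {
    "all_alpha_toy": {
        "start": [0.3, 0.7],
        "trans": [[0.85, 0.15], [0.30, 0.70]],
        "emit": [
            {"H": 0.90, "E": 0.02, "C": 0.08},
            {"H": 0.10, "E": 0.05, "C": 0.85},
        ],
    },
    "beta_rich_toy": {
        "start": [0.3, 0.7],
        "trans": [[0.85, 0.15], [0.30, 0.70]],
        "emit": [
            {"H": 0.02, "E": 0.90, "C": 0.08},
            {"H": 0.05, "E": 0.10, "C": 0.85},
        ],
    },
}

if __name__ == "__main__":
    best, scores = recognize(MODELS, "CHHHHHHHCCHHHHHHHC")
    print(best)
    for name, score in scores.items():
        print(name, round(score, 3))
```

In a realistic setting each family model would have many more states and would be fitted to the observed secondary structure sequences of family members; the argmax over log-likelihoods shown here is only one plausible decision rule.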