Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes

Citation
A. Krogh et al., Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes, J MOL BIOL, 305(3), 2001, pp. 567-580
Number of citations
37
Subject categories
Molecular Biology & Genetics
Journal title
JOURNAL OF MOLECULAR BIOLOGY
Journal ISSN
0022-2836
Volume
305
Issue
3
Year of publication
2001
Pages
567 - 580
Database
ISI
SICI code
0022-2836(20010119)305:3<567:PTPTWA>2.0.ZU;2-Z
Abstract
We describe and validate a new membrane protein topology prediction method, TMHMM, based on a hidden Markov model. We present a detailed analysis of TMHMM's performance, and show that it correctly predicts 97-98% of the transmembrane helices. Additionally, TMHMM can discriminate between soluble and membrane proteins with both specificity and sensitivity better than 99%, although the accuracy drops when signal peptides are present. This high degree of accuracy allowed us to predict reliably integral membrane proteins in a large collection of genomes. Based on these predictions, we estimate that 20-30% of all genes in most genomes encode membrane proteins, which is in agreement with previous estimates. We further discovered that proteins with N-in-C-in topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N-out-C-in topologies. We discuss the possible relevance of this finding for our understanding of membrane protein assembly mechanisms. A TMHMM prediction service is available at http://www.cbs.dtu.dk/services/TMHMM/. (C) 2001 Academic Press.