Predicting subcellular localization of proteins based on their N-terminal amino acid sequence

Citation
O. Emanuelsson et al., Predicting subcellular localization of proteins based on their N-terminal amino acid sequence, J MOL BIOL, 300(4), 2000, pp. 1005-1016
Number of citations
52
Subject categories
Molecular Biology & Genetics
Journal title
JOURNAL OF MOLECULAR BIOLOGY
ISSN journal
0022-2836
Volume
300
Issue
4
Year of publication
2000
Pages
1005 - 1016
Database
ISI
SICI code
0022-2836(20000721)300:4<1005:PSLOPB>2.0.ZU;2-3
Abstract
A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed. Using N-terminal sequence information only, it discriminates between proteins destined for the mitochondrion, the chloroplast, the secretory pathway, and "other" localizations with a success rate of 85 % (plant) or 90 % (non-plant) on redundancy-reduced test sets. From a TargetP analysis of the recently sequenced Arabidopsis thaliana chromosomes 2 and 4 and the Ensembl Homo sapiens protein set, we estimate that 10 % of all plant proteins are mitochondrial and 14 % chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10 %. TargetP also predicts cleavage sites with levels of correctly predicted sites ranging from approximately 40 % to 50 % (chloroplastic and mitochondrial presequences) to above 70 % (secretory signal peptides). TargetP is available as a web-server at http://www.cbs.dtu.dk/services/TargetP/. (C) 2000 Academic Press.