DIVERSE INCIDENCES OF INDIVIDUAL OLIGOPEPTIDES (DIPEPTIDIC TO HEXAPEPTIDIC) IN PROTEINS OF HUMAN, BAKERS-YEAST, AND ESCHERICHIA-COLI ORIGIN REGISTERED IN THE SWISS-PROT DATA-BASE

Citation
H. Doi et al., DIVERSE INCIDENCES OF INDIVIDUAL OLIGOPEPTIDES (DIPEPTIDIC TO HEXAPEPTIDIC) IN PROTEINS OF HUMAN, BAKERS-YEAST, AND ESCHERICHIA-COLI ORIGIN REGISTERED IN THE SWISS-PROT DATA-BASE, Proceedings of the National Academy of Sciences of the United States of America, 92(7), 1995, pp. 2879-2883
Number of citations
4
Subject Categories
Multidisciplinary Sciences
Journal ISSN
0027-8424
Volume
92
Issue
7
Year of publication
1995
Pages
2879 - 2883
Database
ISI
SICI code
0027-8424(1995)92:7<2879:DIOIO(>2.0.ZU;2-M
Abstract
Oligopeptidic permutations of the 20 amino acid residues give rise to proteins of diverse functions. Our long-term goal is to produce a lexicon of oligopeptides, classifying them into at least five categories: (i) ubiquitous, (ii) function specific, (iii) group specific, (iv) species specific, and (v) nonexistent. To begin with, we report on the varying frequencies of individual oligopeptides (dipeptidic to hexapeptidic in length) found among 2862 human proteins, 1942 Saccharomyces cerevisiae proteins, and 2672 Escherichia coli proteins registered in the Swiss-Prot data base (version 29.0, released in June 1994). At all lengths (dipeptides to hexapeptides), homooligopeptides were very prominent among the most frequently occurring varieties in proteins of human and bakers' yeast origins. However, this was not the case with E. coli. While all of the expected 20^3 varieties of tripeptides were found among human proteins, three tripeptides (Cys-Cys-Trp, Trp-Trp-Cys, and Trp-Trp-His) were missing from the bakers' yeast proteins. Three tripeptides (Cys-Ile-Trp, Cys-Met-Tyr, and Cys-Trp-Trp) were also absent from E. coli proteins. Inasmuch as the Swiss-Prot data base already contained 67% of the expected total of 4000 E. coli proteins, it is virtually certain that 96,000 varieties of hexapeptides containing at least one or another of the three missing tripeptides noted above shall be nonexistent in E. coli. Furthermore, the observation of missing tripeptides in the bakers' yeast proteins suggests that nonexistent hexapeptides shall be highly phylum specific. Because of the sample size, only a small fraction of the 20^6 varieties of hexapeptides were expected to be encountered in the present survey. Indeed, only 1.2-1.5% of the possible hexapeptides were found, and the average copy number of observed hexapeptides varied between 1.06 and 1.25. Nevertheless, 33 varieties of hexapeptides occurred in 102-169 copies among human proteins. Furthermore, 15 of the 33 varieties contained such rarely used residues as Tyr, His, Cys, and Trp.
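The kind of survey the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' method, which the abstract does not specify. It assumes sequences are given as one-letter amino acid strings, and the function names (count_oligopeptides, missing_oligopeptides, hexapeptide_upper_bound) are hypothetical. The last function reproduces the 96,000 figure as 3 tripeptides x 4 positions within a hexapeptide x 20^3 choices for the remaining residues; that this is how the figure was derived is an assumption, and the product counts a hexapeptide more than once if it contains more than one missing tripeptide.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acid residues, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_STANDARD = set(AMINO_ACIDS)


def count_oligopeptides(sequences, k):
    """Count every overlapping oligopeptide of length k in the sequences.

    Windows containing a non-standard residue code (e.g. X) are skipped.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if set(window) <= _STANDARD:
                counts[window] += 1
    return counts


def missing_oligopeptides(counts, k):
    """List the oligopeptides of length k that never occur in counts."""
    all_kmers = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [kmer for kmer in all_kmers if kmer not in counts]


def hexapeptide_upper_bound(n_missing_tripeptides):
    """Upper bound on hexapeptides containing a missing tripeptide.

    A tripeptide can start at 4 positions in a hexapeptide, and the other
    3 residues are free (20**3 choices). Overlaps are counted repeatedly.
    """
    return n_missing_tripeptides * 4 * 20 ** 3


if __name__ == "__main__":
    # Toy input (hypothetical sequences, not Swiss-Prot data).
    toy = ["MKAAAAW", "MKCWWH"]
    tri = count_oligopeptides(toy, 3)
    print(tri.most_common(1))
    print(len(missing_oligopeptides(tri, 3)))
    print(hexapeptide_upper_bound(3))
```

Enumerating all 20^3 tripeptides is cheap, but the same exhaustive listing for hexapeptides (20^6 = 64,000,000 strings) is far more costly; for that length it is more practical to compare the number of distinct observed hexapeptides against 20^6 than to list the absent ones.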