On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database

Citation
R. Apweiler et al., On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database, BBA-GEN SUB, 1473(1), 1999, pp. 4-8
Number of citations
7
Subject Categories
Biochemistry & Biophysics
Journal title
BIOCHIMICA ET BIOPHYSICA ACTA-GENERAL SUBJECTS
ISSN journal
0304-4165
Volume
1473
Issue
1
Year of publication
1999
Pages
4 - 8
Database
ISI
SICI code
0304-4165(199912)1473:1<4:OTFOPG>2.0.ZU;2-0
Abstract
The SWISS-PROT protein sequence data bank contains at present nearly 75000 entries, almost two thirds of which include the potential N-glycosylation consensus sequence, or sequon, NXS/T (where X can be any amino acid but proline) and thus may be glycoproteins. The number of proteins filed as glycoproteins is however considerably smaller, 7942, of which 749 have been characterized with respect to the total number of their carbohydrate units and sites of attachment of the latter to the protein, as well as the nature of the carbohydrate-peptide linking group. Of these well characterized glycoproteins, about 90% carry either N-linked carbohydrate units alone or both N- and O-linked ones, attached at 1297 N-glycosylation sites (1.9 per glycoprotein molecule) and the rest are O-glycosylated only. Since the total number of sequons in the well characterized glycoproteins is 1968, their rate of occupancy is 2/3. Assuming that the same number of N-linked units and rate of sequon occupancy occur in all sequon containing proteins and that the proportion of solely O-glycosylated proteins (ca. 10%) will also be the same as among the well characterized ones, we conclude that the majority of sequon containing proteins will be found to be glycosylated and that more than half of all proteins are glycoproteins. (C) 1999 Elsevier Science B.V. All rights reserved.
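
The sequon definition and the arithmetic quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the regular expression, the function name `find_sequons`, and the example peptide are assumptions made for illustration, and "about 90%" is taken as exactly 0.90. Only the counts 1297, 1968, and 749 come from the abstract.

```python
import re

# Sequon N-X-S/T, where X is any residue except proline.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 0-based start positions of every N-X-S/T sequon (X != P)."""
    return [m.start() for m in SEQUON.finditer(sequence.upper())]


# Figures quoted in the abstract.
occupied_sites = 1297      # N-glycosylation sites actually carrying carbohydrate
total_sequons = 1968       # sequons in the well characterized glycoproteins
well_characterized = 749   # well characterized glycoproteins

# Assumption: "about 90%" is treated as exactly 0.90.
n_glycosylated = 0.90 * well_characterized

occupancy = occupied_sites / total_sequons                 # roughly 2/3
sites_per_glycoprotein = occupied_sites / n_glycosylated   # roughly 1.9

if __name__ == "__main__":
    # Hypothetical peptide: NGS, NNT and NTS are sequons, NPT is excluded.
    print(find_sequons("MNGSANPTNNTS"))
    print(round(occupancy, 2), round(sites_per_glycoprotein, 1))
```

The presence of a sequon only marks a potential site; per the abstract, about two thirds of sequons in the well characterized set are actually occupied.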