PREDICTION OF O-GLYCOSYLATION OF MAMMALIAN PROTEINS - SPECIFICITY PATTERNS OF UDP-GALNAC-POLYPEPTIDE N-ACETYLGALACTOSAMINYLTRANSFERASE

Citation
J.E. Hansen et al., PREDICTION OF O-GLYCOSYLATION OF MAMMALIAN PROTEINS - SPECIFICITY PATTERNS OF UDP-GALNAC-POLYPEPTIDE N-ACETYLGALACTOSAMINYLTRANSFERASE, Biochemical Journal, 308, 1995, pp. 801-813
Number of citations
104
Subject categories
Biology
Journal title
Biochemical Journal
ISSN journal
02646021
Volume
308
Year of publication
1995
Part
3
Pages
801 - 813
Database
ISI
SICI code
0264-6021(1995)308:<801:POOOMP>2.0.ZU;2-0
Abstract
The specificity of the enzyme(s) catalysing the covalent link between the hydroxyl side chains of serine or threonine and the sugar moiety N-acetylgalactosamine (GalNAc) is unknown. Pattern recognition by artificial neural networks and weight matrix algorithms was performed to determine the exact position of in vivo O-linked GalNAc-glycosylated serine and threonine residues from the primary sequence exclusively. The acceptor sequence context for O-glycosylation of serine was found to differ from that of threonine and the two types were therefore treated separately. The context of the sites showed a high abundance of proline, serine and threonine extending far beyond the previously reported region covering positions -4 through +4 relative to the glycosylated residue. The O-glycosylation sites were found to cluster and to have a high abundance in the N-terminal part of the protein. The sites were also found to have an increased preference for three different classes of beta-turns. No simple consensus-like rule could be deduced for the complex glycosylation sequence acceptor patterns. The neural networks were trained on the hitherto largest data material consisting of 48 carefully examined mammalian glycoproteins comprising 264 O-glycosylation sites. For detection neural network algorithms were much more reliable than weight matrices. The networks correctly found 60-95% of the O-glycosylated serine/threonine residues and 88-97% of the non-glycosylated residues in two independent test sets of known glycoproteins. A computer server using E-mail for prediction of O-glycosylation sites has been implemented and made publicly available. The Internet address is NetOglyc@cbs.dtu.dk.
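Illustrative sketch (not the authors' implementation): the abstract mentions weight matrices scored over the sequence context of candidate serine and threonine residues. The Python below shows one generic way such a window-based log-odds matrix could be built and applied. The function names, the uniform background frequencies, the pseudocount, the flank width and the example windows are all assumptions made for illustration; none of them come from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def windows(seq, residues="ST", flank=4, pad="X"):
    """Yield (index, window) for each candidate residue in seq.

    The window covers positions -flank..+flank around the residue;
    sequence ends are padded with `pad` (a hypothetical placeholder).
    Pass residues="S" or residues="T" to treat the two types separately.
    """
    padded = pad * flank + seq + pad * flank
    for i, aa in enumerate(seq):
        if aa in residues:
            yield i, padded[i:i + 2 * flank + 1]

def build_weight_matrix(site_windows, pseudocount=1.0):
    """Build per-column log-odds weights from windows around known sites.

    Assumes a uniform background over the 20 amino acids (illustrative only).
    """
    background = 1.0 / len(AMINO_ACIDS)
    matrix = []
    for col in range(len(site_windows[0])):
        counts = Counter(w[col] for w in site_windows if w[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        matrix.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix

def score(window, matrix):
    """Sum the column weights; padding characters contribute 0."""
    return sum(col.get(aa, 0.0) for aa, col in zip(window, matrix))

# Made-up example windows (flank=2), not data from the paper.
toy_sites = ["PPTPP", "APTPS", "PSTPP"]
toy_matrix = build_weight_matrix(toy_sites)
```

With these toy windows, a proline-rich context such as "PPTPP" scores higher than "GGTGG". The abstract reports that neural networks were much more reliable than weight matrices for detection, so a matrix like this is at best a baseline.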