Deciphering globular protein sequence-structure relationships: from observation to prediction

Citation
A. Poupon and J.P. Mornon, Deciphering globular protein sequence-structure relationships: from observation to prediction, THEOR CH AC, 106(1-2), 2001, pp. 113-120
Number of citations
19
Subject categories
Physical Chemistry/Chemical Physics
Journal title
THEORETICAL CHEMISTRY ACCOUNTS
Journal ISSN
1432-881X
Volume
106
Issue
1-2
Year of publication
2001
Pages
113 - 120
Database
ISI
SICI code
1432-881X(200106)106:1-2<113:DGPSRF>2.0.ZU;2-I
Abstract
Careful comparison of proteins sharing the same fold but only low or no sequence identity should allow a better understanding of the coding of three-dimensional structures by amino acid sequences. It has already been shown that positions of a given fold occupied mainly by hydrophobic residues in the different proteins of a structural family share very specific physical properties and participate in stabilization of the protein domain. They probably also play a crucial role in the very first steps of folding [Poupon A, Mornon J.P (1999) FEBS Lett. 452: 283-289; Mirny LA, Shakhnovich EI (1999) J. Mol. Biol. 291: 177-196]. To further understand the sequence-structure relationship, we studied the correlation between allowed mutations at a given three-dimensional position and some of its physical properties. The different amino acids were divided into three groups (hydrophobic, nonpolar or weakly polar, and polar or charged), and a correlation was established between the occupation rate of each group at a given position in the fold and the burial, the side-chain dispersion, the inter-position distances and the ability to form a network of directly interacting residues. The results are then applied to predict solvent accessibility. We show that this property can be accurately predicted for about 70% of the residues, providing valuable information concerning the corresponding three-dimensional structures. The results are used to predict other structural features, such as secondary structures, compactness or long-range interactions between residues remote in sequence. This information will allow the number of possible structures for a given sequence to be reduced considerably, simplifying the ab initio modelling problem to a level where it might be solved by computing methods.
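
The occupation rate mentioned in the abstract can be illustrated with a minimal sketch, assuming the input is a set of pre-aligned sequences of equal length with "-" marking gaps. The abstract does not say which residues belong to which of the three groups, so the GROUPS table below is a hypothetical assignment chosen only for illustration, and the function names are likewise invented here; this is not the authors' implementation.

```python
from collections import Counter

# Hypothetical grouping for illustration only: the abstract names the three
# groups but does not list their members.
GROUPS = {
    "hydrophobic": set("VILFMYW"),
    "weakly_polar": set("AGCTSP"),
    "polar_charged": set("DENQKRH"),
}


def group_of(residue):
    """Return the group name for a one-letter residue code, or None for gaps/unknowns."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def occupation_rates(column):
    """Fraction of residues from each group in one alignment column (gaps ignored)."""
    counts = Counter(group_of(r) for r in column)
    counts.pop(None, None)  # drop gaps and unrecognised symbols
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in GROUPS}
    return {name: counts.get(name, 0) / total for name in GROUPS}


def rates_per_position(aligned_sequences):
    """Occupation rates for every column of a list of equal-length aligned sequences."""
    return [occupation_rates(col) for col in zip(*aligned_sequences)]
```

Calling rates_per_position on the aligned members of a structural family gives one dictionary per position, which could then be compared against structural descriptors such as burial; how that comparison is carried out is described in the paper, not in this sketch.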