A. Poupon and J.-P. Mornon, Deciphering globular protein sequence-structure relationships: from observation to prediction, Theor. Chem. Acc., 106(1-2), 2001, pp. 113-120
Careful comparison of proteins sharing the same fold but only low or no
sequence identity should allow a better understanding of the coding of
three-dimensional structures by amino acid sequences. It has already been
shown that
positions of a given fold occupied mainly by hydrophobic residues in the
different proteins of a structural family share very specific physical
properties and participate in stabilization of the protein domain. They
probably
also play a crucial role in the very first steps of folding [Poupon A,
Mornon JP (1999) FEBS Lett. 452: 283-289; Mirny LA, Shakhnovich EI (1999)
J. Mol. Biol. 291: 177-196]. To further understand the sequence-structure
relationship, we studied the correlation between allowed mutations at a
given three-dimensional position and some of its physical properties. The
different amino acids were divided into three groups (hydrophobic, nonpolar
or weakly
polar and polar or charged), and a correlation was established between the
occupation rate of each group at a given position in the fold and the
burial, the side-chain dispersion, the interposition distances and the ability
to form a network of directly interacting residues. The results are then
applied to predict solvent accessibility. We show that this property can be
accurately predicted for about 70% of the residues, providing valuable
information concerning the corresponding three-dimensional structures. The
results are also used to predict other structural features, such as
secondary structures, compactness or long-range interactions between
residues remote in sequence. This information will allow the number of
possible structures for a
given sequence to be reduced considerably, simplifying the ab initio
modelling problem to a level where it might be solved by computational methods.
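
The occupation rate of each amino acid group at a fold position, as described in the abstract, can be illustrated with a minimal sketch. The abstract names the three groups but does not list their members, so the assignment of residues to groups below is an assumption, as are the function name occupation_rates and the use of a gapped multiple alignment as input. It is an illustration of the idea, not the authors' implementation.

```python
# Assumed grouping of the 20 standard amino acids; the abstract does not
# specify which residues belong to which group.
GROUPS = {
    "hydrophobic": set("VILFMYW"),
    "weak": set("AGCTSP"),      # nonpolar or weakly polar (assumption)
    "polar": set("DENQKRH"),    # polar or charged (assumption)
}


def occupation_rates(aligned_sequences):
    """Return, for each alignment column, the fraction of residues
    falling in each group. Gaps and unknown symbols are ignored."""
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")

    rates = []
    for col in range(length):
        counts = {name: 0 for name in GROUPS}
        total = 0
        for seq in aligned_sequences:
            residue = seq[col].upper()
            for name, members in GROUPS.items():
                if residue in members:
                    counts[name] += 1
                    total += 1
                    break
        # A column made only of gaps gets a rate of 0.0 for every group.
        rates.append(
            {name: (counts[name] / total if total else 0.0) for name in GROUPS}
        )
    return rates


if __name__ == "__main__":
    # Toy alignment of four sequences over three positions.
    toy = ["VAD", "LGE", "I-K", "KSR"]
    for position, rate in enumerate(occupation_rates(toy), start=1):
        print(position, rate)
```

Per-position rates of this kind are the quantity that would then be compared with structural descriptors such as burial or interposition distances; those descriptors require three-dimensional coordinates and are outside this sketch.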