A. Irback et al., EVIDENCE FOR NONRANDOM HYDROPHOBICITY STRUCTURES IN PROTEIN CHAINS, Proceedings of the National Academy of Sciences of the United States of America, 93(18), 1996, pp. 9533-9538
The question of whether proteins originate from random sequences of
amino acids is addressed. A statistical analysis is performed in terms
of blocked and random walk values formed by binary hydrophobic
assignments of the amino acids along the protein chains. Theoretical
expectations of these variables from random distributions of
hydrophobicities are compared with those obtained from functional
proteins. The results, which are based upon proteins in the SWISS-PROT
database, convincingly show that the amino acid sequences in proteins
differ from what is expected from random sequences in a statistically
significant way. By
performing Fourier transforms on the random walks, one obtains
additional evidence for nonrandomness of the distributions. We have
also analyzed results from a synthetic model containing only two amino
acid types, hydrophobic and hydrophilic. With reasonable criteria on
good folding properties in terms of thermodynamical and kinetic
behavior, sequences that fold well are isolated. Performing the same
statistical analysis on the sequences that fold well indicates similar
deviations from
randomness as for the functional proteins. The deviations from
randomness can be interpreted as originating from anticorrelations in
terms of an Ising spin model for the hydrophobicities. Our results,
which differ from some previous investigations using other methods,
might have
an impact on how permissive with respect to sequence specificity the
protein folding process is: only sequences with nonrandom
hydrophobicity distributions fold well. Other distributions give rise
to energy landscapes with poor folding properties and hence did not
survive evolution.
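
The block and random-walk quantities described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The hydrophobic residue set, the function names, and the normalization of the block variable (a finite-population variance for a randomly shuffled chain of the same composition) are all assumptions made for illustration; the abstract does not give the paper's exact definitions. With this normalization a shuffled chain gives a block value near 1 on average, and values below 1 indicate anticorrelated hydrophobicities.

```python
import cmath

# Assumption: an illustrative hydrophobic set, not necessarily the paper's choice.
HYDROPHOBIC = frozenset("LIVFMW")


def binary_hydrophobicity(seq):
    """Map each residue to +1 (hydrophobic) or -1 (hydrophilic)."""
    return [1 if aa in HYDROPHOBIC else -1 for aa in seq.upper()]


def random_walk(sigma):
    """Cumulative sum of deviations from the chain's mean hydrophobicity."""
    mean = sum(sigma) / len(sigma)
    walk, pos = [], 0.0
    for s in sigma:
        pos += s - mean
        walk.append(pos)
    return walk


def block_fluctuation(sigma, s):
    """Mean squared block-sum deviation for blocks of size s, divided by
    its expectation for a random shuffle of the same chain (assumed
    normalization)."""
    n = len(sigma)
    if not 1 <= s < n:
        raise ValueError("block size must satisfy 1 <= s < len(sigma)")
    mean = sum(sigma) / n
    var = sum((x - mean) ** 2 for x in sigma) / n
    if var == 0:
        raise ValueError("sequence has no hydrophobicity variation")
    # Non-overlapping blocks; a trailing partial block is dropped.
    blocks = [sigma[i:i + s] for i in range(0, n - s + 1, s)]
    observed = sum((sum(b) - s * mean) ** 2 for b in blocks) / len(blocks)
    # Variance of a sum of s values drawn without replacement from n values.
    expected = s * var * (n - s) / (n - 1)
    return observed / expected


def power_spectrum(walk):
    """Squared magnitude of the discrete Fourier transform of the walk,
    for frequencies k = 1 .. n // 2 (plain O(n^2) DFT, no dependencies)."""
    n = len(walk)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(w * cmath.exp(-2j * cmath.pi * k * j / n)
                    for j, w in enumerate(walk))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


if __name__ == "__main__":
    toy = "MKLEVKDLIRKAWEELVKKG"  # made-up toy string, not a real protein
    sigma = binary_hydrophobicity(toy)
    print(block_fluctuation(sigma, 4))
    print(power_spectrum(random_walk(sigma))[:3])
```

A comparison with random expectations, as in the abstract, would repeat these measurements over many chains and over shuffled copies of each chain; the sketch shows only the per-chain quantities.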