Jo. Wrabl et al., Thermodynamic propensities of amino acids in the native state ensemble: Implications for fold recognition, PROTEIN SCI, 10(5), 2001, pp. 1032-1045
An amino acid sequence, in the context of the solvent environment, contains
all of the thermodynamic information necessary to encode a three-dimensional
protein structure. To investigate the relationship between an amino acid
sequence and its corresponding protein fold, a database of thermodynamic
stability information was assembled that spanned 2951 residues from 44
nonhomologous proteins. This information was obtained using the COREX algorithm,
which computes an ensemble-based description of the native state of a
protein. It was observed that amino acid types partitioned unequally into high,
medium, and low thermodynamic stability environments. Furthermore, these
distributions were reproducible and were significantly different from those
expected from random partitioning. To assess the structural importance of
the distributions, simple fold-recognition experiments were performed based
on a 3D-1D scoring matrix containing only COREX residue stability
information. This procedure was able to recover amino acid sequences corresponding
to correct target structures more effectively than scoring matrices derived
from randomized data. High-scoring sequences were often aligned correctly
with their corresponding target profiles, suggesting that calculated
thermodynamic stability profiles have the potential to encode sequence
information. As a control, identical fold-recognition experiments were performed
on the same database of proteins using DSSP secondary structure information in
the scoring matrix, instead of COREX residue stability information. The
comparable performance of both approaches suggested that COREX residue
stability information and secondary structure information could be of equivalent
utility in more sophisticated fold-recognition techniques. The results of
this work are a consequence of the idea that amino acid sequences fold not
into single, rigidly stable structures but rather into thermodynamic ensembles
best represented by a time-averaged structure.
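
The 3D-1D scoring idea described above can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the log-odds form below (observed frequency of an amino acid in a stability class divided by the frequency expected from random partitioning) is an assumption borrowed from conventional 3D-1D profile methods, not the authors' confirmed procedure. The function names, the pseudocount, the gapless equal-length alignment, and the toy data are all hypothetical; real input would be per-residue stability classes derived from COREX output.

```python
import math
from collections import Counter


def build_scoring_matrix(observations, pseudocount=1.0):
    """Return {(amino_acid, stability_class): log-odds score}.

    observations: iterable of (amino_acid, stability_class) pairs,
    one pair per residue in the database.
    The log-odds form and the pseudocount are assumptions of this sketch.
    """
    observations = list(observations)
    pair_counts = Counter(observations)
    aa_counts = Counter(aa for aa, _ in observations)
    class_counts = Counter(cls for _, cls in observations)
    total = len(observations)
    n_cells = len(aa_counts) * len(class_counts)

    matrix = {}
    for aa in aa_counts:
        for cls in class_counts:
            # Observed joint frequency, smoothed so empty cells stay finite.
            observed = (pair_counts[(aa, cls)] + pseudocount) / (
                total + pseudocount * n_cells
            )
            # Frequency expected if amino acids partitioned randomly.
            expected = (aa_counts[aa] / total) * (class_counts[cls] / total)
            matrix[(aa, cls)] = math.log(observed / expected)
    return matrix


def score_sequence(sequence, profile, matrix):
    """Sum per-position scores for a gapless, equal-length alignment.

    sequence: string of one-letter amino acid codes.
    profile: list of stability classes, one per structure position.
    Pairs absent from the matrix contribute 0 (an assumption).
    """
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have equal length")
    return sum(matrix.get((aa, cls), 0.0) for aa, cls in zip(sequence, profile))


# Invented toy data: leucine mostly in "high", glycine mostly in "low".
toy_observations = (
    [("L", "high")] * 8 + [("L", "low")] * 2
    + [("G", "high")] * 2 + [("G", "low")] * 8
)
toy_matrix = build_scoring_matrix(toy_observations)
toy_profile = ["high", "low"]
matching_score = score_sequence("LG", toy_profile, toy_matrix)
mismatched_score = score_sequence("GL", toy_profile, toy_matrix)
```

With these toy counts, a sequence whose residues sit in their preferred stability classes scores higher against the profile than the reversed sequence, which is the behavior a fold-recognition ranking relies on. A control of the kind the abstract mentions could reuse the same functions with secondary structure labels in place of stability classes.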