Thermodynamic propensities of amino acids in the native state ensemble: Implications for fold recognition

Citation
Jo. Wrabl et al., Thermodynamic propensities of amino acids in the native state ensemble: Implications for fold recognition, PROTEIN SCI, 10(5), 2001, pp. 1032-1045
Number of citations
37
Subject categories
Biochemistry & Biophysics
Journal title
PROTEIN SCIENCE
Journal ISSN
0961-8368
Volume
10
Issue
5
Year of publication
2001
Pages
1032 - 1045
Database
ISI
SICI code
0961-8368(200105)10:5<1032:TPOAAI>2.0.ZU;2-K
Abstract
An amino acid sequence, in the context of the solvent environment, contains all of the thermodynamic information necessary to encode a three-dimensional protein structure. To investigate the relationship between an amino acid sequence and its corresponding protein fold, a database of thermodynamic stability information was assembled that spanned 2951 residues from 44 nonhomologous proteins. This information was obtained using the COREX algorithm, which computes an ensemble-based description of the native state of a protein. It was observed that amino acid types partitioned unequally into high, medium, and low thermodynamic stability environments. Furthermore, these distributions were reproducible and were significantly different than those expected from random partitioning. To assess the structural importance of the distributions, simple fold-recognition experiments were performed based on a 3D-1D scoring matrix containing only COREX residue stability information. This procedure was able to recover amino acid sequences corresponding to correct target structures more effectively than scoring matrices derived from randomized data. High-scoring sequences were often aligned correctly with their corresponding target profiles, suggesting that calculated thermodynamic stability profiles have the potential to encode sequence information. As a control, identical fold-recognition experiments were performed on the same database of proteins using DSSP secondary structure information in the scoring matrix, instead of COREX residue stability information. The comparable performance of both approaches suggested that COREX residue stability information and secondary structure information could be of equivalent utility in more sophisticated fold-recognition techniques.
The results of this work are a consequence of the idea that amino acid sequences fold not into single, rigidly stable structures but rather into thermodynamic ensembles best represented by a time averaged structure.
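Illustrative note: the abstract does not give the form of the 3D-1D scoring matrix, so the following is only a minimal sketch of one common way such a matrix could be built, assuming a log-odds score ln(P(aa | environment) / P(aa)) over the three stability environments named above. The function names `build_scoring_matrix` and `score_sequence`, the pseudocount, the toy counts, and the ungapped residue-by-residue scoring are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

# Stability classes named in the abstract; the labels themselves are assumed.
ENVIRONMENTS = ("high", "medium", "low")


def build_scoring_matrix(observations, pseudocount=1.0):
    """Build a log-odds matrix from (amino_acid, environment) pairs.

    Returns a dict mapping (amino_acid, environment) to
    ln(P(aa | env) / P(aa)). The pseudocount (an assumption) keeps
    unseen pairs from producing log(0).
    """
    observations = list(observations)
    pair_counts = Counter(observations)
    aa_counts = Counter(aa for aa, _ in observations)
    env_counts = Counter(env for _, env in observations)
    total = len(observations)
    amino_acids = sorted(aa_counts)

    matrix = {}
    for env in ENVIRONMENTS:
        denom = env_counts[env] + pseudocount * len(amino_acids)
        for aa in amino_acids:
            p_aa_given_env = (pair_counts[(aa, env)] + pseudocount) / denom
            p_aa = aa_counts[aa] / total
            matrix[(aa, env)] = math.log(p_aa_given_env / p_aa)
    return matrix


def score_sequence(sequence, profile, matrix):
    """Sum the scores of a sequence laid onto an environment profile.

    This is an ungapped, position-by-position comparison, which is a
    simplification of a real 3D-1D alignment.
    """
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have equal length")
    return sum(matrix[(aa, env)] for aa, env in zip(sequence, profile))


# Toy counts (invented): L favors high stability, G favors low stability.
toy = ([("L", "high")] * 8 + [("L", "low")] * 2
       + [("G", "high")] * 2 + [("G", "low")] * 8)
toy_matrix = build_scoring_matrix(toy)
target_profile = ["high", "high", "low"]
native_score = score_sequence("LLG", target_profile, toy_matrix)
decoy_score = score_sequence("GGL", target_profile, toy_matrix)
```

With these toy counts, a sequence whose residues match the environments they favor scores higher than a mismatched one, which is the sense in which a stability profile could help recover its own sequence. A randomized control in the spirit of the abstract could shuffle the environment labels before calling `build_scoring_matrix`.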