We propose a self-consistent approach to analyze knowledge-based atom-atom
potentials used to calculate protein-ligand binding energies. Ligands
complexed to actual protein structures were first built using the SMoG growth
procedure (DeWitte & Shakhnovich, 1996) with a chosen input potential. These
model protein-ligand complexes were used to construct databases from which
knowledge-based protein-ligand potentials were derived. We then tested
several modifications to such potentials and evaluated each by its ability to
reconstruct the input potential using the statistical information available
from a database composed of model complexes. Our data indicate that the most
significant improvement resulted from properly accounting for the following
key issues when estimating the reference state:
(1) the presence of significant nonenergetic effects that influence the
contact frequencies and (2) the presence of correlations in contact patterns
due to chemical structure. The most successful procedure was applied to
derive an atom-atom potential for real protein-ligand complexes. Despite the
simplicity of the model (pairwise contact potential with a single interaction
distance), the derived binding free energies showed a statistically
significant correlation (approximately 0.65) with experimental binding scores for a
diverse set of complexes.
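
To illustrate what deriving a pairwise contact potential from a database of complexes can look like, the following is a minimal Python sketch of the generic inverse-Boltzmann form, g(i,j) = -kT ln(observed / expected). It is not the procedure developed here: the reference state is assumed to be the simple product of the atom-type marginals, which is the kind of naive estimate that ignores the nonenergetic effects and chemical-structure correlations noted above. The function names, the atom-type labels, and the toy counts are all hypothetical.

```python
import math
from collections import defaultdict


def derive_contact_potential(contact_counts, kT=1.0):
    """Inverse-Boltzmann contact potential from contact counts.

    contact_counts maps (protein_atom_type, ligand_atom_type) to the number
    of contacts seen within a single interaction distance.
    ASSUMPTION: the reference state is the product of the marginal contact
    frequencies of the two atom types (a textbook simplification).
    """
    total = sum(contact_counts.values())
    protein_marginal = defaultdict(float)
    ligand_marginal = defaultdict(float)
    for (p_type, l_type), n in contact_counts.items():
        protein_marginal[p_type] += n / total
        ligand_marginal[l_type] += n / total

    potential = {}
    for (p_type, l_type), n in contact_counts.items():
        if n == 0:
            continue  # no statistics for this pair; leave it undefined
        observed = n / total
        expected = protein_marginal[p_type] * ligand_marginal[l_type]
        potential[(p_type, l_type)] = -kT * math.log(observed / expected)
    return potential


def score_complex(contacts, potential):
    """Sum the pair energies over a list of (protein_type, ligand_type) contacts."""
    return sum(potential.get(pair, 0.0) for pair in contacts)


# Toy counts, invented purely for illustration.
toy_counts = {("C", "C"): 60, ("C", "N"): 20, ("N", "C"): 10, ("N", "N"): 10}
toy_potential = derive_contact_potential(toy_counts)
toy_score = score_complex([("C", "C"), ("N", "N"), ("C", "N")], toy_potential)
```

In this sketch, pairs observed more often than the reference state predicts receive negative (favorable) energies, and pairs observed less often receive positive ones. A self-consistency test of the kind described above would feed contacts generated with a known input potential into such a function and compare the output against that input.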