A method is presented for the derivation of knowledge-based pair potentials
that corrects for the varying compositions of different proteins. The
resulting statistical pair potential is more specific than those derived from
previous approaches, as assessed by gapless threading results. Additionally, a
methodology is presented that interpolates from statistical potentials,
when no homologous examples to the protein of interest are in the structural
database used to derive the potential, to a Go-like potential (in which
native interactions are favorable and all nonnative interactions are not) when
homologous proteins are present. For cases in which no protein exceeds
30% sequence identity, pairs of weakly homologous interacting fragments are
employed to enhance the specificity of the potential. In gapless threading,
the mean z score improves from -10.4 for the best statistical pair
potential to -12.8 when the local sequence similarity, fragment-based pair
potentials are used. Examination of the ab initio structure prediction of four
representative globular proteins consistently reveals a qualitative
improvement in the yield of structures in the range of 4 to 6 Å rmsd from native
when the fragment-based pair potential is used, relative to the yield when the
quasichemical pair potential is employed. This suggests that such
protein-specific potentials provide a significant advantage relative to generic
quasichemical potentials. Proteins 2000;38:3-16. © 2000 Wiley-Liss, Inc.
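
The z score reported for gapless threading can be illustrated with a minimal sketch. It assumes structures are reduced to residue contact lists and that the energy is a sum of pair terms over contacts; the function names, the contact-list representation, and the toy two-letter potential are all hypothetical and are not taken from the paper. The z score is the native energy minus the mean decoy energy, divided by the standard deviation of the decoy energies, so a more negative value means the native structure is better separated from the decoys.

```python
from statistics import mean, pstdev


def pair_energy(seq, contacts, potential):
    """Sum pair energies over residue contacts (i, j) for a sequence."""
    total = 0.0
    for i, j in contacts:
        a, b = sorted((seq[i], seq[j]))
        # Pairs absent from the (hypothetical) potential table contribute 0.
        total += potential.get((a, b), 0.0)
    return total


def gapless_threading_z(seq, native_contacts, templates, potential):
    """z score of the native energy against gapless-threading decoys.

    templates is a list of (length, contacts) tuples. The sequence is slid
    along each template without gaps; each offset yields one decoy energy.
    """
    n = len(seq)
    native_e = pair_energy(seq, native_contacts, potential)
    decoy_energies = []
    for length, contacts in templates:
        for offset in range(length - n + 1):
            # Keep contacts that lie fully inside the window, then shift
            # their indices so they address positions in seq.
            window = [
                (i - offset, j - offset)
                for i, j in contacts
                if offset <= i < offset + n and offset <= j < offset + n
            ]
            decoy_energies.append(pair_energy(seq, window, potential))
    sigma = pstdev(decoy_energies)
    if sigma == 0.0:
        raise ValueError("decoy energies have zero variance")
    return (native_e - mean(decoy_energies)) / sigma


# Toy data, invented for illustration only.
toy_potential = {("H", "H"): -1.0}
toy_seq = "HPPH"
toy_native = [(0, 3)]
toy_templates = [(6, [(0, 1), (1, 4), (2, 5)])]
z = gapless_threading_z(toy_seq, toy_native, toy_templates, toy_potential)
print(z)
```

In practice the decoy set would come from threading the sequence through every sufficiently long structure in a database, and the mean z score quoted above would be an average over many test proteins.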