Comparative protein structure prediction is limited mostly by the errors in
alignment and loop modeling. We describe here a new automated modeling
technique that significantly improves the accuracy of loop predictions in
protein structures. The positions of all nonhydrogen atoms of the loop are
optimized in a fixed environment with respect to a pseudo energy function. The
energy is a sum of many spatial restraints that include the bond length,
bond angle, and improper dihedral angle terms from the CHARMM-22 force field,
statistical preferences for the main-chain and side-chain dihedral angles,
and statistical preferences for nonbonded atomic contacts that depend on
the two atom types, their distance through space, and separation in sequence.
The energy function is optimized with the method of conjugate gradients
combined with molecular dynamics and simulated annealing. Typically, the
predicted loop conformation corresponds to the lowest energy conformation
among 500 independent optimizations. Predictions were made for 40 loops of
known structure at each length from 1 to 14 residues. The accuracy of loop
predictions is evaluated as a function of thoroughness of conformational
sampling, loop length, and structural properties of native loops. When
accuracy is measured by local superposition of the model on the native loop,
100, 90,
and 30% of 4-, 8-, and 12-residue loop predictions, respectively, had
<2 Angstrom RMSD error for the main-chain N, C-alpha, C, and O atoms; the
average accuracies were 0.59 +/- 0.05, 1.16 +/- 0.10, and 2.61 +/- 0.16 Angstrom,
respectively. To simulate real comparative modeling problems, the method
was also evaluated by predicting loops of known structure in only
approximately correct environments with errors typical of comparative
modeling without misalignment. When the RMSD distortion of the main-chain
stem atoms was 2.5 Angstrom, the average loop prediction error increased by
18, 25, and 3% for 4-, 8-, and 12-residue loops, respectively. The accuracy
of the lowest energy prediction for a given loop can be estimated from the
structural variability among a number of low energy predictions. The
relative value of the
present method is gauged by (1) comparing it with one of the most successful
previously described methods, and (2) describing its accuracy in recent
blind predictions of protein structure. Finally, it is shown that the
average accuracy of prediction is limited primarily by the accuracy of the
energy function rather than by the extent of conformational sampling.
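
The selection step described above (keep the lowest energy result of many
independent optimizations, and use the variability among the low energy
results as an indicator of reliability) can be illustrated with a minimal
Python sketch. Everything in it is an assumption made for illustration: the
one-dimensional toy_energy stands in for the pseudo energy function, plain
gradient descent in minimize stands in for conjugate gradients with molecular
dynamics and simulated annealing, and the names n_runs and n_low are
hypothetical.

```python
import math
import random
import statistics


def toy_energy(x: float) -> float:
    """Hypothetical 1-D energy with several local minima (not the real function)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def toy_gradient(x: float) -> float:
    """Analytical derivative of toy_energy."""
    return 0.2 * x + 3.0 * math.cos(3.0 * x)


def minimize(x0: float, step: float = 0.01, iters: int = 1000) -> float:
    """Plain gradient descent from x0; a stand-in for the real optimizer."""
    x = x0
    for _ in range(iters):
        x -= step * toy_gradient(x)
    return x


def predict(n_runs: int = 500, n_low: int = 10, seed: int = 0):
    """Run n_runs independent optimizations from random starting points.

    Returns the lowest energy found, the coordinate at which it was found,
    and the standard deviation of the coordinate over the n_low lowest
    energy results (a crude analogue of structural variability).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        x = minimize(rng.uniform(-10.0, 10.0))
        results.append((toy_energy(x), x))
    results.sort()  # ascending by energy
    best_energy, best_x = results[0]
    spread = statistics.pstdev([x for _, x in results[:n_low]])
    return best_energy, best_x, spread


if __name__ == "__main__":
    energy, x, spread = predict()
    print(f"lowest energy {energy:.4f} at x = {x:.4f}; spread of low energy runs {spread:.2e}")
```

In this toy setting a small spread means that the low energy runs agree with
one another. Whether such agreement also implies closeness to the native
conformation depends on the accuracy of the energy function, which is the
limitation stated in the last sentence of the abstract.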