Ba. Reva et al., WHAT IS THE PROBABILITY OF A CHANCE PREDICTION OF A PROTEIN-STRUCTURE WITH AN RMSD OF 6 ANGSTROM, Folding & design, 3(2), 1998, pp. 141-147
Background: The root mean square deviation (rmsd) between corresponding
atoms of two protein chains is a commonly used measure of similarity
between two protein structures. The smaller the rmsd between two
structures, the more similar they are. In protein structure prediction,
one needs to know the rmsd between predicted and experimental
structures for which a prediction can be considered to be successful.
Success is obvious only when the rmsd is as small as that for closely
homologous proteins (< 3 Angstrom). To estimate the quality of the
prediction in the more general case, one has to compare the native
structure not only with the predicted one but also with randomly chosen
protein-like folds. One can ask: how many such structures must be
considered to find a structure with a given rmsd from the native
structure? Results: We calculated the rmsd values between native
structures of 142
proteins and all compact structures obtained in the threading of these
protein chains over 364 non-homologous structures. The rmsd
distributions have a Gaussian form, with the average rmsd approximately
proportional to the radius of gyration. Conclusions: We estimated the number
of protein-like structures required to obtain a structure within an
rmsd of 6 Angstrom to be 10^4-10^5 for chains of 60-80 residues and
10^11-10^12 structures for chains of 160-200 residues. The probability
of obtaining a 6 Angstrom rmsd by chance is so remote that when such
structures are obtained from a prediction algorithm, the prediction
should be considered quite successful.
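
The reasoning in the abstract can be illustrated with a minimal sketch. It assumes that rmsd values against random protein-like folds follow a Gaussian distribution, as the abstract reports. The function names, the example mean and standard deviation, and the simplification that structures are already superimposed are all assumptions made here for illustration; none of them come from the paper. If p is the Gaussian probability that a random fold falls within the rmsd threshold, roughly 1/p folds must be examined to find one.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between corresponding atoms.

    Assumption: both chains have the same number of atoms and are
    already superimposed; no optimal rotation or translation is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def chance_probability(threshold, mean_rmsd, sd_rmsd):
    """P(rmsd <= threshold) under a Gaussian rmsd distribution."""
    z = (threshold - mean_rmsd) / sd_rmsd
    # Standard normal CDF written with the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def structures_needed(threshold, mean_rmsd, sd_rmsd):
    """Expected number of random folds to examine before one falls
    within the threshold, taken as the reciprocal of the probability."""
    return 1.0 / chance_probability(threshold, mean_rmsd, sd_rmsd)


if __name__ == "__main__":
    # Hypothetical placeholder parameters, NOT values from the paper.
    n = structures_needed(6.0, mean_rmsd=14.0, sd_rmsd=2.0)
    print(f"random folds needed (illustrative): {n:.3g}")
```

Because the abstract states that the average rmsd grows roughly in proportion to the radius of gyration, longer chains push the 6 Angstrom threshold further into the Gaussian tail, which is consistent with the much larger number of structures reported for 160-200 residue chains.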