Motivation: Until ab initio structure prediction methods are perfected, the
estimation of structure for protein molecules will depend on combining
multiple sources of experimental and theoretical data. Secondary structure
predictions are a particularly useful source of structural information, but
are currently only about 70% correct, on average. Structure computation
algorithms that incorporate secondary structure information must therefore
have methods for dealing with imperfect predictions.
Experiments performed: We have modified our algorithm for probabilistic
least squares structural computations to accept 'disjunctive' constraints,
in which a constraint is provided as a set of possible values, each weighted
with a probability. Thus, when a helix is predicted, the distances associated
with a helix are given most of the weight, but some weight can be allocated
to the other possibilities (strand and coil). We have tested a variety of
strategies for this weighting scheme in conjunction with a baseline
synthetic set of sparse distance data, and compared them with strategies
that do not use disjunctive constraints.
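As a rough illustration of what a disjunctive constraint might look like as
a data structure, the following Python sketch gives the predicted state most
of the weight and splits the remainder evenly between the other two. All
names (Alternative, TEMPLATES, disjunctive_constraint, mixture_moments) are
hypothetical, the distance values are illustrative placeholders rather than
figures from this work, and the even split and the moment summary are
assumptions made for the example; they are not necessarily the weighting
strategies or the update rule used in the algorithm described here.

```python
from dataclasses import dataclass

# Placeholder (mean, variance) values in angstroms for one residue-pair
# distance under each secondary structure state. Illustrative only.
TEMPLATES = {
    "helix": (6.0, 1.0),
    "strand": (13.0, 1.0),
    "coil": (9.0, 4.0),
}


@dataclass(frozen=True)
class Alternative:
    label: str       # secondary structure state
    mean: float      # expected distance for this state
    variance: float  # uncertainty of that distance
    weight: float    # probability assigned to this alternative


def disjunctive_constraint(predicted, confidence, templates=TEMPLATES):
    """Give `confidence` to the predicted state and split the rest evenly.

    The even split is an assumption; other weighting strategies are possible.
    """
    if predicted not in templates:
        raise KeyError(f"unknown state: {predicted}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if len(templates) < 2:
        raise ValueError("need at least two states to form a disjunction")
    rest = (1.0 - confidence) / (len(templates) - 1)
    return [
        Alternative(state, mean, var, confidence if state == predicted else rest)
        for state, (mean, var) in templates.items()
    ]


def mixture_moments(alternatives):
    """Mean and variance of the weighted mixture of alternatives.

    One simple way to summarize a disjunction as a single Gaussian-like
    constraint; assumed here for illustration only.
    """
    mean = sum(a.weight * a.mean for a in alternatives)
    second = sum(a.weight * (a.variance + a.mean ** 2) for a in alternatives)
    return mean, second - mean ** 2


if __name__ == "__main__":
    constraint = disjunctive_constraint("helix", 0.7)
    for alt in constraint:
        print(alt)
    print(mixture_moments(constraint))
```

Setting confidence to 1.0 in this sketch recovers the naive interpretation,
in which the predicted state is treated as certain and the alternatives
receive zero weight.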
Results: Naive interpretations in which predictions were taken as 100%
correct led to poor-quality structures. Interpretations that allow
disjunctive constraints are quite robust, and even relatively poor
predictions (58% correct) can significantly increase the quality of computed
structures (almost halving the RMS error from the known structure).
Conclusions: Secondary structure predictions can be used to improve the
quality of three-dimensional structural computations. In fact, when
interpreted appropriately, imperfect predictions can provide almost as much
improvement as perfect predictions in three-dimensional structure
calculations.
Contact: rba@smi.stanford.edu.