Ar. Ortiz et al., FOLD ASSEMBLY OF SMALL PROTEINS USING MONTE-CARLO SIMULATIONS DRIVEN BY RESTRAINTS DERIVED FROM MULTIPLE SEQUENCE ALIGNMENTS, Journal of Molecular Biology, 277(2), 1998, pp. 419-448
The feasibility of predicting the global fold of small proteins by
incorporating predicted secondary and tertiary restraints into ab
initio folding simulations has been demonstrated on a test set of 20
nonhomologous proteins, one of which was a blind prediction of target
42 in the recent CASP2 contest. These proteins contain from 37 to 100
residues and represent all secondary structural classes and a
representative variety of global topologies. Secondary structure
restraints are provided by the PHD secondary structure prediction
algorithm, which incorporates multiple sequence information. Predicted
tertiary restraints are derived from multiple sequence alignments via
a two-step process. First, seed side-chain contacts are identified
from correlated mutation analysis, and then a threading-based
algorithm is used to expand the number of these seed contacts. A
lattice-based reduced protein model and a folding algorithm designed
to incorporate these predicted restraints are described. Depending
upon fold complexity, it is possible to assemble native-like
topologies whose coordinate root-mean-square deviation from native is
between 3.0 Å and 6.5 Å. The requisite level of accuracy in side-chain
contact map prediction can be roughly 25% on average, provided that
about 60% of the contact predictions are correct within ±1 residue and
95% of the predictions are correct within ±4 residues. Precision in
tertiary contact prediction is more critical than absolute accuracy.
Furthermore, only a subset of the tertiary contacts, on the order of
25% of the total, is sufficient for successful topology assembly.
Overall, this study suggests that the use of restraints derived from
multiple sequence alignments combined with a fold assembly algorithm
holds considerable promise for the prediction of the global topology
of small proteins. (C) 1998 Academic Press Limited.
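
The tolerance criterion quoted in the abstract (contact predictions correct within +/-1 or +/-4 residues) can be illustrated with a minimal Python sketch. It assumes that a predicted contact (i, j) counts as correct within +/-delta when some native contact (k, l) satisfies |i - k| <= delta and |j - l| <= delta. That definition, the function name `fraction_within`, and the toy contact lists are assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

# A contact is a pair of residue indices (i, j); we assume i < j throughout.
Contact = Tuple[int, int]


def fraction_within(predicted: Iterable[Contact],
                    native: Set[Contact],
                    delta: int) -> float:
    """Fraction of predicted contacts that have a native contact
    within +/-delta residues at both positions (assumed definition)."""
    predicted = list(predicted)
    if not predicted:
        return 0.0
    hits = 0
    for i, j in predicted:
        # A hit needs one native contact close to BOTH residues of the pair.
        if any(abs(i - k) <= delta and abs(j - l) <= delta for k, l in native):
            hits += 1
    return hits / len(predicted)


# Toy data: hypothetical residue index pairs, not results from the paper.
native_contacts = {(3, 20), (5, 18), (10, 30), (12, 28)}
predicted_contacts = [(3, 20), (6, 18), (10, 34), (40, 50)]

for d in (0, 1, 4):
    frac = fraction_within(predicted_contacts, native_contacts, d)
    print(f"delta={d}: {frac:.2f}")
```

On this toy data the fraction rises from 0.25 (exact match) to 0.50 (within +/-1) and 0.75 (within +/-4), which shows how a prediction set with low exact accuracy can still place most of its contacts near true ones.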