FOLD ASSEMBLY OF SMALL PROTEINS USING MONTE-CARLO SIMULATIONS DRIVEN BY RESTRAINTS DERIVED FROM MULTIPLE SEQUENCE ALIGNMENTS

Citation
A.R. Ortiz et al., FOLD ASSEMBLY OF SMALL PROTEINS USING MONTE-CARLO SIMULATIONS DRIVEN BY RESTRAINTS DERIVED FROM MULTIPLE SEQUENCE ALIGNMENTS, Journal of Molecular Biology, 277(2), 1998, pp. 419-448
Number of citations
56
Subject Categories
Biology
ISSN journal
00222836
Volume
277
Issue
2
Year of publication
1998
Pages
419 - 448
Database
ISI
SICI code
0022-2836(1998)277:2<419:FAOSPU>2.0.ZU;2-J
Abstract
The feasibility of predicting the global fold of small proteins by incorporating predicted secondary and tertiary restraints into ab initio folding simulations has been demonstrated on a test set comprised of 20 nonhomologous proteins, of which one was a blind prediction of target 42 in the recent CASP2 contest. These proteins contain from 37 to 100 residues and represent all secondary structural classes and a representative variety of global topologies. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Predicted tertiary restraints are derived from multiple sequence alignments via a two-step process. First, seed side-chain contacts are identified from correlated mutation analysis, and then a threading-based algorithm is used to expand the number of these seed contacts. A lattice-based reduced protein model and a folding algorithm designed to incorporate these predicted restraints are described. Depending upon fold complexity, it is possible to assemble native-like topologies whose coordinate root-mean-square deviation from native is between 3.0 Angstrom and 6.5 Angstrom. The requisite level of accuracy in side-chain contact map prediction can be roughly 25% on average, provided that about 60% of the contact predictions are correct within +/-1 residue and 95% of the predictions are correct within +/-4 residues. Precision in tertiary contact prediction is more critical than absolute accuracy. Furthermore, only a subset of the tertiary contacts, on the order of 25% of the total, is sufficient for successful topology assembly. Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm holds considerable promise for the prediction of the global topology of small proteins. (C) 1998 Academic Press Limited.
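The contact-accuracy figures quoted in the abstract ("correct within +/-1 residue", "within +/-4 residues") can be illustrated with a minimal Python sketch. The abstract does not define the tolerance criterion, so the definition used here is an assumption: a predicted contact (i, j) counts as correct within +/-delta if some native contact (k, l) satisfies |i - k| <= delta and |j - l| <= delta. The function name `fraction_within` and the example contact lists are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple

Contact = Tuple[int, int]  # pair of residue indices (i, j)


def fraction_within(predicted: Iterable[Contact],
                    native: Set[Contact],
                    delta: int) -> float:
    """Fraction of predicted contacts lying within +/-delta residues
    of some native contact at both positions.

    ASSUMPTION: this tolerance definition is illustrative; the abstract
    does not spell out the exact criterion used by the authors.
    """
    predicted = list(predicted)
    if not predicted:
        return 0.0
    # Store every contact with the smaller residue index first.
    native_norm = {(min(i, j), max(i, j)) for i, j in native}
    hits = 0
    for i, j in predicted:
        i, j = min(i, j), max(i, j)
        if any(abs(i - k) <= delta and abs(j - l) <= delta
               for k, l in native_norm):
            hits += 1
    return hits / len(predicted)


# Hypothetical toy data, not from the paper.
native = {(3, 20), (5, 18), (10, 40)}
predicted = [(3, 20), (6, 18), (12, 44), (25, 60)]

for delta in (0, 1, 4):
    print(delta, fraction_within(predicted, native, delta))
```

On this toy data the fraction rises from 0.25 at delta = 0 to 0.5 at delta = 1 and 0.75 at delta = 4, mirroring the abstract's point that a prediction set with low exact accuracy can still be mostly correct once a small positional shift is tolerated.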