MODELING PROTEIN CORES WITH MARKOV RANDOM-FIELDS

Citation
Jv. White et al., MODELING PROTEIN CORES WITH MARKOV RANDOM-FIELDS, Mathematical biosciences, 124(2), 1994, pp. 149-179
Number of citations
33
Subject categories
Mathematical Methods, Biology & Medicine; Mathematics, Miscellaneous; Biology, Miscellaneous
Journal title
Mathematical biosciences
Journal ISSN
0025-5564
Volume
124
Issue
2
Year of publication
1994
Pages
149 - 179
Database
ISI
SICI code
0025-5564(1994)124:2<149:MPCWMR>2.0.ZU;2-2
Abstract
A mathematical formalism is introduced that has general applicability to many protein structure models used in the various approaches to the "inverse protein folding problem." The inverse nature of the problem arises from the fact that one begins with a set of assumed tertiary structures and searches for those most compatible with a new sequence, rather than attempting to predict the structure directly from the new sequence. The formalism is based on the well-known theory of Markov random fields (MRFs). Our MRF formulation provides explicit representations for the relevant amino acid position environments and the physical topologies of the structural contacts. In particular, MRF models can readily be constructed for the secondary structure packing topologies found in protein domain cores, or other structural motifs, that are anticipated to be common among large sets of both homologous and nonhomologous proteins. MRF models are probabilistic and can exploit the statistical data from the limited number of proteins having known domain structures. The MRF approach leads to a new scoring function for comparing different threadings (placements) of a sequence through different structure models. The scoring function is very important, because comparing alternative structure models with each other is a key step in the inverse folding problem. Unlike previously published scoring functions, the one derived in this paper is based on a comprehensive probabilistic formulation of the threading problem.
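The abstract does not give the scoring function itself, so the following is only a minimal sketch of the general idea of scoring a threading under a pairwise MRF: the log-score is a sum of per-position (environment) terms and per-contact (edge) terms. The toy structure model, the hydrophobic residue set, the potential values, and the function names are all illustrative assumptions, not the paper's model, and the threading is restricted to gapless offsets for simplicity.

```python
import math

# Hypothetical toy structure model (an assumption, not from the paper):
# three core positions with an environment label each, plus the contacts
# between positions, which play the role of MRF edges.
ENVIRONMENTS = ["buried", "exposed", "buried"]
CONTACTS = [(0, 2)]

# Illustrative residue class; real potentials would be estimated from
# proteins with known domain structures.
HYDROPHOBIC = set("AVLIMFW")


def node_log_potential(env, residue):
    """Log-potential for placing a residue in a position environment."""
    match = (residue in HYDROPHOBIC) == (env == "buried")
    return math.log(0.7) if match else math.log(0.3)


def edge_log_potential(res_a, res_b):
    """Log-potential for two residues placed at contacting positions."""
    both = res_a in HYDROPHOBIC and res_b in HYDROPHOBIC
    return math.log(0.6) if both else math.log(0.4)


def threading_log_score(sequence, offset):
    """Unnormalized log-score of a gapless threading starting at `offset`."""
    residues = sequence[offset:offset + len(ENVIRONMENTS)]
    if len(residues) < len(ENVIRONMENTS):
        raise ValueError("threading runs past the end of the sequence")
    score = sum(node_log_potential(e, r) for e, r in zip(ENVIRONMENTS, residues))
    score += sum(edge_log_potential(residues[i], residues[j]) for i, j in CONTACTS)
    return score


def best_threading(sequence):
    """Return the offset whose threading has the highest log-score."""
    offsets = range(len(sequence) - len(ENVIRONMENTS) + 1)
    return max(offsets, key=lambda k: threading_log_score(sequence, k))
```

Calling `best_threading("GSLKVG")` compares the four possible gapless placements of the sequence on the toy model. Comparing scores across different structure models would additionally require each model's normalization, which this sketch leaves out.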