MODELING PROTEIN CORES WITH MARKOV RANDOM-FIELDS

Authors

WHITE JV MUCHNIK I SMITH TF

Citation

J.V. White et al., MODELING PROTEIN CORES WITH MARKOV RANDOM-FIELDS, Mathematical biosciences, 124(2), 1994, pp. 149-179

Subject categories

Mathematical Methods, Biology & Medicine; Mathematics, Miscellaneous; Biology, Miscellaneous

Journal title

Mathematical biosciences

ISSN journal

00255564

Volume

124

Issue

2

Year of publication

1994

Pages

149 - 179

Database

ISI

SICI code

0025-5564(1994)124:2<149:MPCWMR>2.0.ZU;2-2

Abstract

A mathematical formalism is introduced that has general applicability to many protein structure models used in the various approaches to the "inverse protein folding problem." The inverse nature of the problem arises from the fact that one begins with a set of assumed tertiary structures and searches for those most compatible with a new sequence, rather than attempting to predict the structure directly from the new sequence. The formalism is based on the well-known theory of Markov random fields (MRFs). Our MRF formulation provides explicit representations for the relevant amino acid position environments and the physical topologies of the structural contacts. In particular, MRF models can readily be constructed for the secondary structure packing topologies found in protein domain cores, or other structural motifs, that are anticipated to be common among large sets of both homologous and nonhomologous proteins. MRF models are probabilistic and can exploit the statistical data from the limited number of proteins having known domain structures. The MRF approach leads to a new scoring function for comparing different threadings (placements) of a sequence through different structure models. The scoring function is very important, because comparing alternative structure models with each other is a key step in the inverse folding problem. Unlike previously published scoring functions, the one derived in this paper is based on a comprehensive probabilistic formulation of the threading problem.
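
Illustrative sketch (not taken from the paper): the abstract describes a probabilistic score that combines amino acid position environments with structural contacts and is used to compare threadings. The toy Python below shows one way a pairwise MRF-style log-score could be organized. Every name and number in it (`environments`, `contacts`, `singleton`, `pairwise`, `FLOOR`, the probability values) is a hypothetical placeholder, and the normalizing constant of the random field is ignored, which is only harmless when comparing placements within a single structure model.

```python
import math

# Hypothetical toy core model: three positions with environment labels,
# plus a contact graph saying which positions touch each other.
environments = ["buried", "buried", "exposed"]
contacts = [(0, 1), (1, 2)]

# Hypothetical potentials; a real model would estimate these from
# proteins with known structures.
singleton = {
    ("buried", "L"): 0.6, ("buried", "K"): 0.1,
    ("exposed", "L"): 0.2, ("exposed", "K"): 0.5,
}
pairwise = {
    frozenset(["L"]): 0.5,        # L in contact with L
    frozenset(["L", "K"]): 0.2,   # L in contact with K
    frozenset(["K"]): 0.1,        # K in contact with K
}
FLOOR = 1e-3  # assumed fallback for residue types not in the toy tables


def log_score(sequence, offset):
    """Unnormalized log-score of placing `sequence` on the model at `offset`."""
    residues = sequence[offset:offset + len(environments)]
    if len(residues) < len(environments):
        raise ValueError("threading runs past the end of the sequence")
    score = 0.0
    # Singleton terms: how well each residue fits its position environment.
    for env, aa in zip(environments, residues):
        score += math.log(singleton.get((env, aa), FLOOR))
    # Pairwise terms: one per edge of the contact graph.
    for i, j in contacts:
        pair = frozenset([residues[i], residues[j]])
        score += math.log(pairwise.get(pair, FLOOR))
    return score


def best_threading(sequence):
    """Return the offset whose placement has the highest log-score."""
    offsets = range(len(sequence) - len(environments) + 1)
    return max(offsets, key=lambda o: log_score(sequence, o))
```

For example, `best_threading("KLLK")` compares the two possible gap-free placements of a four-residue sequence on the three-position toy model. Comparing placements across different structure models, as the abstract discusses, would additionally require handling each model's normalization, which this sketch does not attempt.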