Rh. Lathrop et Tf. Smith, GLOBAL OPTIMUM PROTEIN THREADING WITH GAPPED ALIGNMENT AND EMPIRICAL PAIR SCORE FUNCTIONS, Journal of Molecular Biology, 255(4), 1996, pp. 641-665
We describe a branch-and-bound search algorithm for finding the exact
global optimum gapped sequence-structure alignment (''threading'') bet
ween a protein sequence and a protein core or structural model, using
an arbitrary amino acid pair score function (e.g. contact potentials,
knowledge-based potentials, potentials of mean force, etc.). The searc
h method imposes minimal conditions on how structural environments are
defined or the form of the score function, and allows arbitrary seque
nce-specific functions for scoring loops and active site residues. Con
sequently the search method can be used with many different score func
tions and threading methodologies; this paper illustrates five from th
e literature. On a desktop workstation running LISP, we have found the
global optimum protein sequence-structure alignment in NP-hard search
spaces as large as 9.6 x 10(31), at rates ranging as high as 6.8 x 10
(28) equivalent threadings per second (most of which are pruned before
they ever are examined explicitly). Continuing the procedure past the
global optimum enumerates successive candidate threadings in monotoni
cally increasing score order. We give efficient algorithms for search
space size, uniform random sampling, segment placement probabilities,
mean, standard deviation and partition function. The method should pro
ve useful for structure prediction, as well as for critical evaluation
of new pair score functions. (C) 1996 Academic Press Limited