A POLYNOMIAL-TIME ALGORITHM FOR A CLASS OF PROTEIN THREADING PROBLEMS

Citation
Y. Xu et Ec. Uberbacher, A POLYNOMIAL-TIME ALGORITHM FOR A CLASS OF PROTEIN THREADING PROBLEMS, Computer applications in the biosciences, 12(6), 1996, pp. 511-517
Citations number
15
Categorie Soggetti
Mathematical Methods, Biology & Medicine","Computer Sciences, Special Topics","Computer Science Interdisciplinary Applications","Biology Miscellaneous
ISSN journal
02667061
Volume
12
Issue
6
Year of publication
1996
Pages
511 - 517
Database
ISI
SICI code
0266-7061(1996)12:6<511:APAFAC>2.0.ZU;2-U
Abstract
This paper presents an algorithm for constructing an optimal alignment between a three-dimensional protein structure template and an amino a cid sequence. A protein structure template is given as a sequence of a mino acid residue positions in three-dimensional space, along with an array of physical properties attached to each position; these residue positions are sequentially grouped into a series of core secondary str uctures (central helices and beta sheets). In addition to match scores and gap penalties, as in a traditional sequence-sequence alignment pr oblem, the quality of a structure-sequence alignment is also detemined by interaction preferences among amino acids aligned with structure p ositions that are spatially close (we call these 'long-range interacti ons'). Although it is known that constructing such a structure-sequenc e alignment in the most general form is NP-hard, our algorithm runs in polynomial time when restricted to structures with a 'modest' number of long-range amino acid interactions. In the current work, long-range interactions are limited to interactions between amino acids from dif ferent cove secondary structures. Dividing the series of core secondar y structures into two subseries creates a cut set of long-range intera ctions. If we use N, M and C to represent the size of an amino acid se quence, the size of a structure template, and the maximum cut size of long-range interactions respectively, the algorithm finds an optimal s tructure-sequence alignment in O(21(C)NM) time, a polynomial function of N and M when C = O(log(N + M)). When running on structure-sequence alignment problems without long-range intersections, i.e. C = O, the a lgorithm achieves the same asymptotic computational complexity of the Smith-Waterman sequence-sequence alignment algorithm.