AN EFFICIENT COMPUTATIONAL METHOD FOR GLOBALLY OPTIMAL THREADING

Citation
Y. Xu et al., AN EFFICIENT COMPUTATIONAL METHOD FOR GLOBALLY OPTIMAL THREADING, Journal of computational biology, 5(3), 1998, pp. 597-614
Citations number
29
Categorie Soggetti
Mathematics,Biology,"Biochemical Research Methods",Mathematics,"Biothechnology & Applied Migrobiology
ISSN journal
10665277
Volume
5
Issue
3
Year of publication
1998
Pages
597 - 614
Database
ISI
SICI code
1066-5277(1998)5:3<597:AECMFG>2.0.ZU;2-A
Abstract
Computational recognition of native-like folds of an anonymous amino a cid sequence from a protein fold database is considered to be a promis ing approach to the three-dimensional (3D) fold prediction of the amin o acid sequence. We present a new method for protein fold recognition through optimally aligning an amino acid sequence and a protein fold t emplate (protein threading). The fitness of aligning an amino acid seq uence with a fold template is measured by (1) the singleton fitness, r epresenting the compatibility of substituting one amino acid by anothe r and the combined preference of secondary structure and solvent acces sibility for a particular amino acid, (2) the pair,vise interaction, r epresenting the contact preference between a pair of amino acids, and (3) alignment gap penalties. Though a protein threading problem so def ined is known to be NP-hard in the most general sense, our algorithm r uns efficiently if we place a cutoff distance on the pairwise interact ions, as many of the existing threading programs do. For an amino acid sequence of size n and a fold template of size m with M core secondar y structures, the algorithm finds an optimal alignment in O (Mn1.5C+1 + mn(C+1)) time and O (MnC+1) space, where C is a (small) nonnegative integer, determined by a particular mathematical property of the pairw ise interactions. As a case study, we have demonstrated that C is less than or equal to 4 for about 75% of the 293 unique folds in our prote in database, when pairwise interactions are restricted to amino acids less than or equal to 7 Angstrom apart (measured between their beta ca rbon atoms). An approximation scheme is developed for fold templates w ith C > 4, when threading requires too much memory and time to be prac tical on a typical workstation.