SECONDARY STRUCTURE PREDICTION USING SEGMENT SIMILARITY

Citation
L. Rychlewski and A. Godzik, SECONDARY STRUCTURE PREDICTION USING SEGMENT SIMILARITY, Protein Engineering, 10(10), 1997, pp. 1143-1153
Number of citations
39
Subject categories
Biotechnology & Applied Microbiology; Biology
Journal title
Protein Engineering
Journal ISSN
0269-2139
Volume
10
Issue
10
Year of publication
1997
Pages
1143 - 1153
Database
ISI
SICI code
0269-2139(1997)10:10<1143:SSPUSS>2.0.ZU;2-N
Abstract
We present a secondary structure prediction method based on finding similarities between sequence segments from the target sequence and segments contained in the database of proteins with known structures. The similarity definition is optimized using a genetic algorithm and is based on a 21 x 40 similarity matrix, comparing a target sequence with the sequence and burial status of the proteins from the database. The three-state secondary structure prediction accuracy reaches 72.4% on a non-homologous (maximum sequence identity <25%) data set derived from PDB and is reproduced on two independent testing sets, including the set of CASP2 prediction targets and a group of newly solved PDB structures. The prediction method was developed with simplicity and open architecture in mind, allowing for an easy extension to other types of predictions and to the analysis of the contributions to the local structure formation. For instance, the design of the prediction procedure allows us to trace back segments of the database that contributed to the prediction. It can be shown that those segments came from various structural classes and that even complete exclusion of related folds from the database does not result in a significant decrease in prediction accuracy.
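Illustrative sketch
The general idea of segment-based prediction with trace-back can be sketched in Python as follows. This is not the authors' implementation. It assumes a toy three-protein database, a plain identity score in place of the genetic-algorithm-optimized 21 x 40 matrix, no burial information, and a fixed window of five residues. The names DATABASE, score and predict are hypothetical.

```python
from collections import Counter

# Hypothetical toy database: (sequence, three-state secondary structure).
DATABASE = [
    ("AEELLKKA", "HHHHHHHH"),  # helix
    ("VKVTVEVY", "EEEEEEEE"),  # strand
    ("GPGSGNGD", "CCCCCCCC"),  # coil
]


def score(seg_a, seg_b):
    """Identity score; a stand-in for an optimized similarity matrix."""
    return sum(1 for a, b in zip(seg_a, seg_b) if a == b)


def predict(target, database=DATABASE, window=5, top_k=1):
    """Return (prediction, sources) for the target sequence.

    sources[i] lists the (protein index, offset) of the database
    segments that voted for position i, so a prediction can be traced
    back to the segments that produced it.
    """
    half = window // 2
    # Every database window, labelled with the state of its central residue.
    segments = []
    for p, (seq, ss) in enumerate(database):
        for i in range(len(seq) - window + 1):
            segments.append((seq[i:i + window], ss[i + half], (p, i)))

    # Pad the target so that every residue sits at the centre of a window.
    padded = "-" * half + target + "-" * half
    prediction, sources = [], []
    for pos in range(len(target)):
        query = padded[pos:pos + window]
        # Stable sort: ties keep database order.
        ranked = sorted(segments, key=lambda s: -score(query, s[0]))
        best = ranked[:top_k]
        votes = Counter(state for _, state, _ in best)
        prediction.append(votes.most_common(1)[0][0])
        sources.append([origin for _, _, origin in best])
    return "".join(prediction), sources


if __name__ == "__main__":
    pred, src = predict("AEELLKKA")
    print(pred)
    print(src[2])
```

Raising top_k lets several similar segments vote on each residue. The sources list is what makes the kind of trace-back described in the abstract possible in this sketch.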