Identification of related proteins with weak sequence identity using secondary structure information

Authors

Geourjon, C; Combet, C; Blanchet, C; Deleage, G

Citation

C. Geourjon et al., Identification of related proteins with weak sequence identity using secondary structure information, PROTEIN SCI, 10(4), 2001, pp. 788-797

Citations number

Subject Categories

Biochemistry & Biophysics

Journal title

PROTEIN SCIENCE

ISSN journal

09618368

Volume

10

Issue

4

Year of publication

2001

Pages

788 - 797

Database

ISI

SICI code

0961-8368(200104)10:4<788:IORPWW>2.0.ZU;2-0

Abstract

Molecular modeling of proteins is confronted with the problem of finding homologous proteins, especially when few identities remain after the process of molecular evolution. Even with the most recent methods based on sequence identity detection, structural relationships are still difficult to establish with high reliability. As protein structures are more conserved than sequences, we investigated the possibility of using protein secondary structure comparison (observed or predicted structures) to discriminate between related and unrelated protein sequences in the range of 10%-30% sequence identity. Pairwise comparisons of secondary structures were measured using the structural overlap (Sov) parameter. In this article, we show that if the secondary structure likeness is >50%, most of the pairs are structurally related. Taking into account the secondary structures of proteins that have been detected by BLAST, FASTA, or SSEARCH in the noisy region (with high E value), we show that distantly related protein sequences (even with <20% identity) can still be identified. This strategy can be used to identify three-dimensional templates in homology modeling by finding unexpected related proteins and to select proteins for experimental investigation in a structural genomic approach, as well as for genome annotation.
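
The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the two secondary structure strings are already aligned, gap-free, and of equal length, and it computes a segment overlap score in the spirit of the Sov parameter (the exact variant used in the article is not stated in the abstract). The function names `segments`, `segment_overlap`, and `likely_related` are hypothetical; only the 50% cutoff comes from the abstract.

```python
from itertools import groupby


def segments(ss):
    """Split a secondary structure string into (state, start, end) runs; end is exclusive."""
    runs, pos = [], 0
    for state, group in groupby(ss):
        length = len(list(group))
        runs.append((state, pos, pos + length))
        pos += length
    return runs


def segment_overlap(first, second):
    """Segment overlap score (0-100) between two equal-length secondary structure strings.

    Assumption: a Sov-style measure that rewards overlapping segments of the
    same state and tolerates small shifts at segment ends.
    """
    if len(first) != len(second):
        raise ValueError("strings must be aligned to the same length")
    runs_second = segments(second)
    total, norm = 0.0, 0
    for state, a1, b1 in segments(first):
        len1 = b1 - a1
        # Segments of the same state in the second string that overlap this one.
        partners = [(a2, b2) for s2, a2, b2 in runs_second
                    if s2 == state and a2 < b1 and a1 < b2]
        if not partners:
            norm += len1  # unmatched segments still count in the normalisation
            continue
        for a2, b2 in partners:
            len2 = b2 - a2
            minov = min(b1, b2) - max(a1, a2)  # length of the actual overlap
            maxov = max(b1, b2) - min(a1, a2)  # total extent of the two segments
            delta = min(maxov - minov, minov, len1 // 2, len2 // 2)
            total += (minov + delta) / maxov * len1
            norm += len1
    return 100.0 * total / norm if norm else 0.0


def likely_related(first, second, threshold=50.0):
    """Apply the >50% likeness cutoff mentioned in the abstract."""
    return segment_overlap(first, second) > threshold
```

In a workflow like the one the abstract describes, such a check would be applied to low-confidence hits returned by a sequence search, keeping only those whose secondary structure likeness exceeds the cutoff. How the hits are aligned and how predicted structures are obtained are left open here.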