RECOGNITION OF ANALOGOUS AND HOMOLOGOUS PROTEIN FOLDS - ANALYSIS OF SEQUENCE AND STRUCTURE CONSERVATION

Citation
Rb. Russell et al., RECOGNITION OF ANALOGOUS AND HOMOLOGOUS PROTEIN FOLDS - ANALYSIS OF SEQUENCE AND STRUCTURE CONSERVATION, Journal of Molecular Biology, 269(3), 1997, pp. 423-439
Number of citations
49
Subject Categories
Biology
ISSN journal
00222836
Volume
269
Issue
3
Year of publication
1997
Pages
423 - 439
Database
ISI
SICI code
0022-2836(1997)269:3<423:ROAAHP>2.0.ZU;2-N
Abstract
An analysis was performed on 335 pairs of structurally aligned proteins derived from the structural classification of proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. These similarities were divided into analogues, defined as proteins with similar three-dimensional structures (same SCOP fold classification) but generally with different functions and little evidence of a common ancestor (different SCOP superfamily classification). Homologues were defined as pairs of similar structures likely to be the result of evolutionary divergence (same superfamily) and were divided into remote, medium and close sub-divisions based on the percentage sequence identity. Particular attention was paid to the differences between analogues and remote homologues, since both types of similarities are generally undetectable by sequence comparison and their detection is the aim of fold recognition methods. Distributions of sequence identities and substitution matrices suggest a higher degree of sequence similarity in remote homologues than in analogues. Matrices for remote homologues show similarity to existing mutation matrices, providing some validity for their use in previously described fold recognition methods. In contrast, matrices derived from analogous proteins show little conservation of amino acid properties beyond broad conservation of hydrophobic or polar character. Secondary structure and accessibility were more conserved on average in remote homologues than in analogues, though there was no apparent difference in the root-mean-square deviation between these two types of similarities. Alignments of remote homologues and analogues show a similar number of gaps, openings (one or more sequential gaps) and inserted/deleted secondary structure elements, and both generally contain more gaps/openings/deleted secondary structure elements than medium and close homologues. These results suggest that gap parameters for fold recognition should be more lenient than those used in sequence comparison. Parameters were derived from the analogue and remote homologue datasets for potential use in fold recognition methods. Implications for protein fold recognition and evolution are discussed. (C) 1997 Academic Press Limited.