Dipeptide frequency/bias analysis identifies conserved sites of nonrandomness shared by cysteine-rich motifs

Citation
S.R. Campion et al., Dipeptide frequency/bias analysis identifies conserved sites of nonrandomness shared by cysteine-rich motifs, PROTEINS, 44(3), 2001, pp. 321-328
Number of citations
24
Subject categories
Biochemistry & Biophysics
Journal title
PROTEINS-STRUCTURE FUNCTION AND GENETICS
Journal ISSN
0887-3585
Volume
44
Issue
3
Year of publication
2001
Pages
321 - 328
Database
ISI
SICI code
0887-3585(20010815)44:3<321:DFAICS>2.0.ZU;2-A
Abstract
This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules. (C) 2001 Wiley-Liss, Inc.
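Illustrative sketch
The abstract does not specify how AAPAIR.TAB computes frequency or bias, so the following Python sketch is only one plausible reading, not the authors' implementation. It counts ordered dipeptides (adjacent residue pairs) across a set of sequences and, as an assumption, defines bias as the observed dipeptide frequency divided by the frequency expected from the independent single-residue frequencies. The function names and the toy sequences are hypothetical.

```python
from collections import Counter


def dipeptide_counts(sequences):
    """Count ordered dipeptides (adjacent residue pairs) over all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_bias(sequences):
    """Return {dipeptide: (frequency, bias)} for every observed dipeptide.

    frequency: share of all observed dipeptides taken by this ordered pair.
    bias: frequency / (f(first residue) * f(second residue)).
    The bias ratio is an ASSUMED definition; the abstract does not give one.
    """
    pair_counts = dipeptide_counts(sequences)
    residue_counts = Counter("".join(s.upper() for s in sequences))
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())

    result = {}
    for pair, n in pair_counts.items():
        freq = n / total_pairs
        expected = (residue_counts[pair[0]] / total_residues) * (
            residue_counts[pair[1]] / total_residues
        )
        result[pair] = (freq, freq / expected)
    return result


if __name__ == "__main__":
    # Hypothetical toy fragments, not real EGF, Sushi, or Laminin sequences.
    toy = ["CNGYTGC", "CENGFSC", "CGYNTGC"]
    ranked = sorted(dipeptide_bias(toy).items(), key=lambda kv: -kv[1][1])
    for pair, (freq, bias) in ranked[:5]:
        print(f"{pair}\tfrequency={freq:.3f}\tbias={bias:.2f}")
```

Under this reading, an ordered pair such as GY is treated as distinct from YG, and a bias well above 1 would flag a dipeptide that occurs more often than its constituent residues alone would predict. Locating the positions of such dipeptides in an alignment, as the abstract describes, would be a separate step.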