G.H. Gonnet et al., Analysis of amino-acid substitution during divergent evolution - the 400 by 400 dipeptide substitution matrix, Biochemical and Biophysical Research Communications, 199(2), 1994, pp. 489-496
Most formal methods for analyzing the divergent evolution of protein sequences assume a Markov model where position i in a polypeptide chain undergoes amino acid substitution independently from position i+1. The large number of aligned homologous sequence pairs available from the exhaustive matching of the protein sequence database makes it possible to examine this assumption empirically. We have constructed a 400 by 400 matrix that reports empirical probabilities for the interconversion of all pairs of dipeptides in proteins undergoing divergent evolution. Comparison of these probabilities with those expected if substitution at adjacent positions in a protein sequence were independent reveals interesting patterns that arise through the breakdown of this assumption. Several of these are useful in extracting conformational information from patterns of conservation and variation in homologous protein sequences. (C) 1994 Academic Press, Inc.
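
The comparison described in the abstract, observed dipeptide substitution probabilities against those expected if adjacent positions substituted independently, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names, the use of ungapped equal-length aligned pairs, the pooling of both dipeptide positions into one single-site estimate, and the base-10 log ratio are all assumptions made for illustration. Under independence, the expected probability of the dipeptide ab becoming cd is taken to be P(a -> c) * P(b -> d).

```python
from collections import Counter
from math import log10


def _check(pairs):
    # Assumption: every aligned pair is ungapped and of equal length.
    for s, t in pairs:
        if len(s) != len(t):
            raise ValueError("aligned sequences must have equal length")


def single_site_probs(pairs):
    """Estimate P(a -> c) for single residues from aligned pairs."""
    _check(pairs)
    counts, totals = Counter(), Counter()
    for s, t in pairs:
        for a, c in zip(s, t):
            counts[(a, c)] += 1
            totals[a] += 1
    return {(a, c): n / totals[a] for (a, c), n in counts.items()}


def dipeptide_probs(pairs):
    """Estimate P(ab -> cd) for adjacent residue pairs (positions i, i+1)."""
    _check(pairs)
    counts, totals = Counter(), Counter()
    for s, t in pairs:
        for i in range(len(s) - 1):
            src, dst = s[i:i + 2], t[i:i + 2]
            counts[(src, dst)] += 1
            totals[src] += 1
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


def log_ratio(pairs):
    """log10 of observed dipeptide probability over the independence
    expectation P(a -> c) * P(b -> d), for every observed dipeptide change.
    Positive values mean the change occurs more often than independence
    predicts; zero means the two agree."""
    single = single_site_probs(pairs)
    observed = dipeptide_probs(pairs)
    result = {}
    for (src, dst), p_obs in observed.items():
        p_exp = single[(src[0], dst[0])] * single[(src[1], dst[1])]
        result[(src, dst)] = log10(p_obs / p_exp)
    return result


# Toy data (hypothetical, not from the paper): A and G either both stay
# or both change, so AG -> AG is over-represented relative to independence.
toy_pairs = [("AGAG", "AGAG"), ("AGAG", "STST")]
scores = log_ratio(toy_pairs)
```

In the toy data, each single-residue change has probability 0.5, so independence predicts 0.25 for AG -> AG, while the observed dipeptide probability is 0.5. A real analysis would use all 20 amino acids (giving the 400 by 400 matrix) and would need to handle gaps and sparse counts, which this sketch ignores.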