S. Miyazawa and R.L. Jernigan, A NEW SUBSTITUTION MATRIX FOR PROTEIN-SEQUENCE SEARCHES BASED ON CONTACT FREQUENCIES IN PROTEIN STRUCTURES, Protein Engineering, 6(3), 1993, pp. 267-278
The instabilities of the native structures of mutant proteins with an
amino acid exchange are estimated by using the contact energy and the
number of contacts for each type of amino acid pair, which were estimated
from 18 192 residue-residue contacts observed in 42 crystals of
globular proteins. They were then used to evaluate a transition probability
matrix of codon substitutions and a log relatedness odds matrix,
which is used as a scoring matrix to measure the similarity between
protein sequences. To consider amino acid substitutions in homologous
proteins, base mutation rates and the effects of the genetic code are
also taken into account. The average fitness of an amino acid exchange
is approximated to be proportional to the structural stability of the
mutant protein, which is then approximated by the average energy change
of the protein native structure expected for the amino acid exchange
while neglecting the energy change of the denatured state. In global
and local homology searches, this scoring matrix tends to yield
significantly higher alignment scores than either the unitary matrix or the
genetic code matrix, and may also yield higher alignment scores for
distantly related protein pairs than MDM78. One of the advantages of this
scoring matrix is that the equilibrium frequencies of codons and the
base mutation rates can be adjusted.
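
The step from a transition probability matrix to a log relatedness odds matrix can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a toy three-state alphabet in place of codons, made-up exchangeabilities in place of the contact-energy-derived fitness, and the Dayhoff-style definition of relatedness odds (transition probability divided by the equilibrium frequency of the target state, scaled as 10 log10). The names `transition_matrix`, `log_odds`, `freqs` and `exch` are hypothetical.

```python
import math

def transition_matrix(freqs, exch):
    """Build a row-stochastic matrix M with M[i][j] = P(i -> j) in one step.

    Off-diagonal entries are exch[i][j] * freqs[j]; with a symmetric exch
    this satisfies detailed balance, so freqs is the equilibrium distribution.
    """
    n = len(freqs)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                M[i][j] = exch[i][j] * freqs[j]
        # The diagonal is still 0.0 here, so sum(M[i]) is the off-diagonal mass.
        M[i][i] = 1.0 - sum(M[i])
    return M

def log_odds(M, freqs, scale=10.0):
    """Log relatedness odds: scale * log10(M[i][j] / freqs[j])."""
    n = len(freqs)
    return [[scale * math.log10(M[i][j] / freqs[j]) for j in range(n)]
            for i in range(n)]

# Toy inputs (assumed values, for illustration only).
freqs = [0.5, 0.3, 0.2]                 # equilibrium frequencies
exch = [[0.0, 0.2, 0.05],
        [0.2, 0.0, 0.1],
        [0.05, 0.1, 0.0]]               # symmetric exchangeabilities

M = transition_matrix(freqs, exch)
S = log_odds(M, freqs)

for row in S:
    print(" ".join(f"{x:7.2f}" for x in row))
```

Because the equilibrium frequencies enter as an explicit argument, rerunning the two functions with a different `freqs` gives a rescored matrix, which loosely mirrors the adjustability noted in the abstract.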