K. Tomii and M. Kanehisa, ANALYSIS OF AMINO-ACID INDEXES AND MUTATION MATRICES FOR SEQUENCE COMPARISON AND STRUCTURE PREDICTION OF PROTEINS, Protein Engineering, 9(1), 1996, pp. 27-36
An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biochemical properties of amino acids. As a follow-up to the previous study, we have increased the size of the database, which currently contains 402 published indices, and re-performed the single-linkage cluster analysis. The results basically confirmed the previous findings. Another important feature of amino acids that can be represented numerically is the similarity between them. Thus, a similarity matrix, also called a mutation matrix, is a set of 20x20 numerical values used for protein sequence alignments and similarity searches. We have collected 42 published matrices, performed hierarchical cluster analyses and identified several clusters corresponding to the nature of the data set and the method used for constructing the mutation matrix. Further, we have tried to reproduce each mutation matrix by the combination of amino acid indices in order to understand which properties of amino acids are reflected most. There was a relationship between the PAM units of Dayhoff's mutation matrix and the volume and hydrophobicity of amino acids. The database of 402 amino acid indices and 42 amino acid mutation matrices is made publicly available on the Internet.
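
The single-linkage cluster analysis of indices mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance measure or the cut-off, so the code below assumes a distance of 1 - |r| (r being the Pearson correlation between two 20-value indices) and an arbitrary threshold of 0.5. The three indices are made-up placeholder numbers, not entries from the database, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def distance(x, y):
    """Assumed distance: strongly (anti)correlated indices are close."""
    return 1.0 - abs(pearson(x, y))


def single_linkage(indices, threshold):
    """Single-linkage clustering cut at a fixed distance threshold.

    Two clusters are merged whenever their closest pair of members
    lies nearer than `threshold`; merging repeats until nothing changes.
    """
    clusters = [{name} for name in indices]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                closest = min(
                    distance(indices[a], indices[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if closest < threshold:
                    clusters[i] |= clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Placeholder data: 20 values per index, one per amino acid (made up).
toy_a = [float(i) for i in range(20)]
toy_b = [2.0 * v + 1.0 for v in toy_a]                     # linear rescaling of toy_a
toy_c = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]   # nearly uncorrelated with toy_a

toy = {"toy_a": toy_a, "toy_b": toy_b, "toy_c": toy_c}
clusters = single_linkage(toy, threshold=0.5)
print(sorted(sorted(c) for c in clusters))
```

With these placeholder values, toy_a and toy_b fall into one cluster because one is a linear rescaling of the other, while toy_c stays separate. Real use would substitute the published index values and a distance measure and cut-off chosen to match the study.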