An analysis of the frequency of use of amino acids in the CDR-1 and
CDR-2 of 1500 immunoglobulins showed that the frequencies of amino acids
in different positions could be fitted by two types of distribution.
For some positions the frequencies were fitted by an inverse power law
and for other positions by an exponential distribution. In order to
see whether the more frequently used amino acids for specific positions
had physicochemical properties or attributes in common, they were
clustered using an algorithm normally applied to artificial intelligence
problems. It was found that the amino acids in those positions fitted
by the inverse power law have similar hydrophobicity and volume, which
are commonly attributes of amino acids in structural positions. Thus,
if these positions are critical to maintaining the structural features
of the CDR domains, the rest of the positions should be either
properly involved in the recognition process or irrelevant. The
frequencies of amino acids in these recognition positions were fitted by
the exponential law, and it was found by the clustering analysis that
these amino acids share properties of a more general type, such as capability
of forming hydrogen bonds, polarity, etc. This suggests that at least
part of the recognition mechanism requires general properties rather
than specific amino acids. Amino acids sharing the required attributes
for each one of these positions are then used with random frequency.
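
The distinction between the two distribution types can be illustrated with a minimal sketch. It is not the fitting procedure used in the analysis, which is not described here. It assumes that the amino acid frequencies at one position are sorted into rank order, that each model is fitted by least squares on a log-transformed axis, and that the fits are compared by their R^2 in log space. The function names linear_fit and compare_fits are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r2


def compare_fits(frequencies):
    """Compare a power-law and an exponential fit to rank-ordered frequencies.

    Assumption: frequencies are analysed as a function of rank (1 = most used).
    Power law:   log f = a - b * log(rank)
    Exponential: log f = a - b * rank
    """
    # Zero frequencies are dropped because their logarithm is undefined.
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(ranked) < 3:
        raise ValueError("need at least three non-zero frequencies")
    ranks = list(range(1, len(ranked) + 1))
    log_f = [math.log(f) for f in ranked]
    log_r = [math.log(r) for r in ranks]

    p_slope, _, p_r2 = linear_fit(log_r, log_f)   # log-log axes
    e_slope, _, e_r2 = linear_fit(ranks, log_f)   # semi-log axes
    return {
        "power_law": {"exponent": -p_slope, "r2": p_r2},
        "exponential": {"rate": -e_slope, "r2": e_r2},
        "better": "power_law" if p_r2 > e_r2 else "exponential",
    }


if __name__ == "__main__":
    # Synthetic frequencies for illustration only, not data from the analysis.
    synthetic = [r ** -1.5 for r in range(1, 21)]
    print(compare_fits(synthetic)["better"])
```

Comparing R^2 values in log space is only a rough criterion; a likelihood-based comparison would be a reasonable alternative for real counts.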