In 1970, before any antigen-bound immunoglobulin structure had been
solved, Elvin Kabat proposed that regions of high amino acid diversity
would be the antigen binding sites of immunoglobulin (Kabat, 1970).
Conversely, sites of low variability were proposed to be structural
framework regions. Wu and Kabat defined this variability as the number
of different amino acids found at a site divided by the relative
frequency of the most common amino acid at that site (Wu and Kabat,
1970). Several groups have subsequently devised improvements on Kabat-Wu
variability analysis (Litwin and Jores, 1992). While these methods are
somewhat better than Kabat-Wu, they still suffer from its basic
limitation: they account for only the one or two most common amino acids
when estimating diversity. This leads to underestimates of low
diversities and exaggerations of high diversities. Shannon information
analysis eliminates serious bias and is more stable than Kabat-Wu and
second-generation measures of diversity (Jores et al., 1990; Wu and
Kabat, 1970). Statistical reliability can be measured using Shannon
analysis, and Shannon measurements can be provided with error estimates.
Here we use Shannon's method to analyze the amino acid diversity at each
site of T cell receptor V-alpha and V-beta to identify complementarity
determining regions and framework sites. Our results reveal that the T
cell receptor is significantly more diverse than immunoglobulin,
suggesting that the T cell receptor has more than the four previously
discovered complementarity determining regions. These new
complementarity determining regions may represent a larger antigen
combining site, additional combining sites, or an evolutionary strategy
to avoid inappropriate interaction with other molecules. © 1998 Elsevier
Science Ltd. All rights reserved.
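
The two per-site measures contrasted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each alignment column is assumed to be a plain string of one-letter amino acid codes, and the Shannon entropy is assumed to be computed in bits (log base 2). Error estimates are not covered.

```python
import math
from collections import Counter


def wu_kabat_variability(residues):
    """Kabat-Wu variability for one alignment column.

    Number of different amino acids at the site divided by the relative
    frequency of the most common amino acid (the definition quoted above).
    `residues` is assumed to be a string such as "AAAABBCC".
    """
    if not residues:
        raise ValueError("column is empty")
    counts = Counter(residues)
    most_common_freq = max(counts.values()) / len(residues)
    return len(counts) / most_common_freq


def shannon_entropy(residues):
    """Shannon entropy of one alignment column, in bits (assumed base 2).

    Every observed amino acid contributes according to its frequency,
    not only the most common one.
    """
    if not residues:
        raise ValueError("column is empty")
    n = len(residues)
    counts = Counter(residues)
    # Sum p * log2(1/p) over the observed amino acids.
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```

With these definitions, the hypothetical columns "AAAABBCC" and "AAAABBBC" receive the same Kabat-Wu variability, because both contain three different amino acids and share the same most common frequency, while their Shannon entropies differ. This illustrates the sense in which the Shannon measure uses the full frequency distribution at a site.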