A nonrestrictive method for identifying covariance in protein families is described and applied to human and mouse germline V kappa and VH sequence alignments. Amino acids that occur at each position in a sequence alignment are divided into two sets, called words, by generating all possible combinations of alternative amino acids. Each word is associated with a pattern of changes. Words with identical patterns identify covariant positions. In antibody variable domains, the number of words generated ranged from 1103 to 2195 depending on the alignment, of which 4 to 12% occurred in covariant pairs. Despite the nonrestrictive character of pattern generation, covariant residues did not reflect a random selection with respect to the nature of amino acid changes and/or their spatial proximity in a reference crystallographic structure. This approach allowed the identification of a covariance signal for positions with high variability, mostly located in the outer part of the common structural framework of antibody variable domains. Covariance in these regions may reflect the existence of alternative and mutually exclusive atomic arrangements that are compatible with antibody function. The method may be of general applicability to rationalize residue variability in protein families. Proteins 2000;41:475-484. © 2000 Wiley-Liss, Inc.
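
The word-and-pattern idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: a pattern is taken to be a binary vector over the aligned sequences (1 where the residue belongs to the word's set), a pattern and its complement are treated as the same division, and the function names (words_for_column, canonical, covariant_positions) and the toy alignment are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def words_for_column(column):
    """Yield (subset, pattern) for each division of a column's residues into two sets."""
    residues = sorted(set(column))
    rest = residues[1:]
    # Keeping the first residue out of the subset lists each two-set division once.
    for size in range(1, len(rest) + 1):
        for subset in combinations(rest, size):
            chosen = frozenset(subset)
            # Assumed pattern: 1 where the sequence carries a residue from the subset.
            pattern = tuple(int(aa in chosen) for aa in column)
            yield chosen, pattern


def canonical(pattern):
    """Map a pattern and its complement to one form (assumption: they are equivalent)."""
    return pattern if pattern[0] == 0 else tuple(1 - bit for bit in pattern)


def covariant_positions(alignment):
    """Group words by pattern and keep the patterns shared by more than one position."""
    by_pattern = defaultdict(list)
    for pos in range(len(alignment[0])):
        column = [seq[pos] for seq in alignment]
        for subset, pattern in words_for_column(column):
            by_pattern[canonical(pattern)].append((pos, subset))
    return {
        pattern: words
        for pattern, words in by_pattern.items()
        if len({pos for pos, _ in words}) > 1
    }


# Hypothetical toy alignment: positions 0 and 1 change together, position 2 does not.
alignment = ["AKC", "AKD", "GRC", "GRD"]
for pattern, words in covariant_positions(alignment).items():
    print(pattern, [(pos, sorted(subset)) for pos, subset in words])
```

A position with n distinct residues yields 2^(n-1) - 1 words under this enumeration, so highly variable positions contribute many candidate patterns. That is consistent with the abstract's point that the approach can pick up a covariance signal at positions with high variability.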