We report the latest release (version 1.6) of the CATH protein domains
database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical
classification of 18 577 domains into evolutionary families and structural
groupings. We have identified 1028 homologous superfamilies in which the
proteins have both structural, and sequence or functional similarity. These
can be
further clustered into 672 fold groups and 35 distinct architectures. Recent
developments of the database include the generation of 3D templates for
recognising structural relatives in each fold group, which has led to
significant improvements in the speed and accuracy of updating the database
and
also means that less manual validation is required. We also report the
establishment of the CATH-PFDB (Protein Family Database), which associates
1D sequences with the 3D homologous superfamilies. Sequences showing
identifiable homology to entries in CATH have been extracted from GenBank
using PSI-BLAST. A CATH-PSIBLAST server has been established, which allows
you to scan
a new sequence against the database. The CATH Dictionary of Homologous
Superfamilies (DHS), which contains validated multiple structural alignments
annotated with consensus functional information for evolutionary protein
superfamilies, has been updated to include annotations associated with
sequence relatives identified in GenBank. The DHS is a powerful tool for
considering the variation of functional properties within a given CATH
superfamily and for deciding what functional properties may be reliably
inherited by a newly identified relative.
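
The nesting of domains into architectures, fold groups and homologous
superfamilies can be illustrated with a minimal sketch. It assumes each
domain carries a four-part dotted code (class, architecture, fold group,
homologous superfamily); that encoding, the function names and the toy
codes below are illustrative assumptions, not data or software from the
release described here.

```python
from collections import defaultdict


def parse_code(code):
    """Split an assumed dotted code 'C.A.T.H' into one key per level.

    Each key is cumulative, so two fold groups with the same number under
    different architectures are kept apart.
    """
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dotted fields, got {code!r}")
    c, a, t, _h = parts
    return {
        "class": c,
        "architecture": f"{c}.{a}",
        "fold_group": f"{c}.{a}.{t}",
        "superfamily": code,
    }


def count_levels(domain_codes):
    """Count the distinct groups at each level for a {domain_id: code} map."""
    seen = defaultdict(set)
    for code in domain_codes.values():
        for level, key in parse_code(code).items():
            seen[level].add(key)
    return {level: len(keys) for level, keys in seen.items()}


# Toy input: hypothetical domain identifiers and codes, for illustration only.
toy_domains = {
    "dom_a": "1.10.8.10",
    "dom_b": "1.10.8.20",
    "dom_c": "1.20.5.30",
    "dom_d": "3.40.50.300",
}

counts = count_levels(toy_domains)
print(counts)
```

Applied to a full classification, the same tally would yield figures of
the kind quoted above (superfamilies, fold groups, architectures), provided
the input follows the assumed encoding.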