Assigning genomic sequences to CATH

Citation
F.M.G. Pearl et al., Assigning genomic sequences to CATH, NUCL ACID R, 28(1), 2000, pp. 277-282
Number of citations
28
Subject Categories
Biochemistry & Biophysics
Journal title
NUCLEIC ACIDS RESEARCH
Journal ISSN
0305-1048
Volume
28
Issue
1
Year of publication
2000
Pages
277 - 282
Database
ISI
SICI code
0305-1048(20000101)28:1<277:AGSTC>2.0.ZU;2-H
Abstract
We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and 35 distinct architectures. Recent developments of the database include the generation of 3D templates for recognising structural relatives in each fold group, which has led to significant improvements in the speed and accuracy of updating the database and also means that less manual validation is required. We also report the establishment of the CATH-PFDB (Protein Family Database), which associates 1D sequences with the 3D homologous superfamilies. Sequences showing identifiable homology to entries in CATH have been extracted from GenBank using PSI-BLAST. A CATH-PSIBLAST server has been established, which allows you to scan a new sequence against the database. The CATH Dictionary of Homologous Superfamilies (DHS), which contains validated multiple structural alignments annotated with consensus functional information for evolutionary protein superfamilies, has been updated to include annotations associated with sequence relatives identified in GenBank. The DHS is a powerful tool for considering the variation of functional properties within a given CATH superfamily and in deciding what functional properties may be reliably inherited by a newly identified relative.