ProClass is a protein family database that organizes non-redundant sequence
entries into families defined collectively by PROSITE patterns and PIR
superfamilies. By combining global similarities and functional motifs into
a single classification scheme, ProClass helps to reveal domain and family
relationships and classify multi-domain proteins. The database currently
consists of more than 120 000 sequence entries, approximately 60% of which
are classified into about 3500 families. To maximize family information
retrieval, the database provides links to various protein family/domain
and structural
class databases and contains multiple motif alignments of all PROSITE
patterns as well as global alignments of PIR superfamilies. The motif
sequences are retrieved from both PIR-International and SWISS-PROT
databases, including a large number of new members detected by our
GeneFIND family identification system. ProClass can be used to support
full-scale genomic annotation,
because of its high classification rate. The ProClass database is available
for on-line search and record retrieval from our WWW server at
http://diana.uthct.edu/proclass.html.
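
As a rough illustration of how motif patterns can assign one sequence to
more than one family, the sketch below translates PROSITE-style pattern
syntax into regular expressions and reports every family whose motif occurs
in a sequence. It is not the ProClass or GeneFIND procedure: the family
names, patterns and sequence are hypothetical, and the functions
prosite_to_regex and classify are names assumed here for illustration only.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles x (any residue), [ABC] (allowed set), {ABC} (forbidden set),
    repeat counts such as (2) or (2,4), and the < and > terminal anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count: x(2) or x(2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.group(1), match.group(2), match.group(3)
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # Single residues and [ABC] sets are already valid regex syntax.
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def classify(sequence: str, family_patterns: dict) -> list:
    """Return the names of all families whose motif occurs in the sequence."""
    return [
        family
        for family, pattern in family_patterns.items()
        if re.search(prosite_to_regex(pattern), sequence)
    ]


# Hypothetical families and motifs, invented for this example.
FAMILY_PATTERNS = {
    "FAM_A": "[AG]-x(2)-{P}-C-x(1,2)-H.",
    "FAM_B": "W-x(3)-[ST]-K.",
}

# A made-up sequence carrying both motifs, as a multi-domain protein might.
print(classify("MMAKLTCGHQQWAAASKLL", FAMILY_PATTERNS))
```

Because each pattern is tested independently, a sequence containing several
motifs is reported under several families, which is the behaviour a
motif-based scheme needs for multi-domain proteins.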