A Web-based classification system of DNA-binding protein families

Citation
M. Karmirantzou and S.J. Hamodrakas, A Web-based classification system of DNA-binding protein families, PROTEIN ENG, 14(7), 2001, pp. 465-472
Number of citations
32
Subject Categories
Biochemistry & Biophysics
Journal title
PROTEIN ENGINEERING
ISSN journal
0269-2139
Volume
14
Issue
7
Year of publication
2001
Pages
465 - 472
Database
ISI
SICI code
0269-2139(200107)14:7<465:AWCSOD>2.0.ZU;2-2
Abstract
Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The family of DNA-binding proteins is one of the most populated and studied amongst the various genomes of bacteria, archaea and eukaryotes and the Web-based system presented here is an approach to their classification. The DnaProt resource is an annotated and searchable collection of protein sequences for the families of DNA-binding proteins. The database contains 3238 full-length sequences (retrieved from the SWISS-PROT database, release 38) that include, at least, a DNA-binding domain. Sequence entries are organized into families defined by PROSITE patterns, PRINTS motifs and de novo excised signatures. Combining global similarities and functional motifs into a single classification scheme, DNA-binding proteins are classified into 33 unique classes, which helps to reveal comprehensive family relationships. To maximize family information retrieval, DnaProt contains a collection of multiple alignments for each DNA-binding family while the recognized motifs can be used as diagnostically functional fingerprints. All available structural class representatives have been referenced. The resource was developed as a Web-based management system for online free access of customized data sets. Entries are fully hyperlinked to facilitate easy retrieval of the original records from the source databases while functional and phylogenetic annotation will be applied to newly sequenced genomes. The database is freely available for online search of a library containing specific patterns of the identified DNA-binding protein classes and retrieval of individual entries from our WWW server (http://kronos.biol.uoa.gr/~mariak/dbDNA.html).