The COG database: new developments in phylogenetic classification of proteins from complete genomes

Citation
R.L. Tatusov et al., The COG database: new developments in phylogenetic classification of proteins from complete genomes, NUCL ACID R, 29(1), 2001, pp. 22-28
Number of citations
11
Subject categories
Biochemistry & Biophysics
Journal title
NUCLEIC ACIDS RESEARCH
Journal ISSN
0305-1048
Volume
29
Issue
1
Year of publication
2001
Pages
22 - 28
Database
ISI
SICI code
0305-1048(20010101)29:1<22:TCDNDI>2.0.ZU;2-M
Abstract
The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis.