Domain combinations in archaeal, eubacterial and eukaryotic proteomes

Citation
G. Apic et al., Domain combinations in archaeal, eubacterial and eukaryotic proteomes, J MOL BIOL, 310(2), 2001, pp. 311-325
Number of citations
24
Subject categories
Molecular Biology & Genetics
Journal title
JOURNAL OF MOLECULAR BIOLOGY
ISSN journal
0022-2836
Volume
310
Issue
2
Year of publication
2001
Pages
311 - 325
Database
ISI
SICI code
0022-2836(20010706)310:2<311:DCIAEA>2.0.ZU;2-L
Abstract
There is a limited repertoire of domain families that are duplicated and combined in different ways to form the set of proteins in a genome. Proteins are gene products, and at the level of genes, duplication, recombination, fusion and fission are the processes that produce new genes. We attempt to gain an overview of these processes by studying the evolutionary units in proteins, domains, in the protein sequences of 40 genomes. The domain and superfamily definitions in the Structural Classification of Proteins Database are used, so that we can view all pairs of adjacent domains in genome sequences in terms of their superfamily combinations. We find 783 out of the 859 superfamilies in SCOP in these genomes, and the 783 families occur in 1307 pairwise combinations. Most families are observed in combination with one or two other families, while a few families are very versatile in their combinatorial behaviour; 209 families do not make combinations with other families. This type of pattern can be described as a scale-free network. We also study the N to C-terminal orientation of domain pairs and domain repeats. The phylogenetic distribution of domain combinations is surveyed, to establish the extent of common and kingdom-specific combinations. Of the kingdom-specific combinations, significantly more combinations consist of families present in all three kingdoms than of families present in one or two kingdoms. Hence, we are led to conclude that recombination between common families, as compared to the invention of new families and recombination among these, has also been a major contribution to the evolution of kingdom-specific and species-specific functions in organisms in all three kingdoms. Finally, we compare the set of the domain combinations in the genomes to those in the RCSB Protein Data Bank, and discuss the implications for structural genomics. (C) 2001 Academic Press.
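
Illustrative sketch (not part of the record): the counting of pairwise combinations of adjacent domains described in the abstract can be pictured with a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' procedure. The protein names, the family labels A to D, and the function names are hypothetical. The sketch also assumes that a combination is an unordered pair of two different families and that same-family neighbours count as repeats rather than combinations; the abstract does not spell out either convention.

```python
# Hypothetical toy input: each protein is an N-to-C ordered list of
# superfamily labels (made-up labels, not real SCOP identifiers).
proteins = {
    "p1": ["A", "B"],
    "p2": ["B", "A"],
    "p3": ["A", "C", "A"],
    "p4": ["D"],
    "p5": ["A", "A"],
}


def adjacent_combinations(proteins):
    """Return the set of unordered pairs of distinct adjacent families."""
    combos = set()
    for domains in proteins.values():
        # zip pairs each domain with its immediate C-terminal neighbour.
        for left, right in zip(domains, domains[1:]):
            if left != right:  # assumption: same-family neighbours are repeats
                combos.add(frozenset((left, right)))
    return combos


def partner_counts(proteins, combos):
    """Map every observed family to its number of distinct partner families."""
    partners = {fam: set() for domains in proteins.values() for fam in domains}
    for combo in combos:
        a, b = tuple(combo)
        partners[a].add(b)
        partners[b].add(a)
    return {fam: len(found) for fam, found in partners.items()}


combos = adjacent_combinations(proteins)
degrees = partner_counts(proteins, combos)
# Families that never sit next to a different family.
isolated = sorted(fam for fam, degree in degrees.items() if degree == 0)

print(len(combos), degrees, isolated)
```

In this toy set, family A plays the role of a versatile family with two partners, while D makes no combinations. Treating families as nodes and combinations as edges gives the network view the abstract refers to; a scale-free pattern would appear as a heavy-tailed distribution of the partner counts over a real data set.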