Distribution of protein folds in the three superkingdoms of life

Authors

Wolf, YI; Brenner, SE; Bash, PA; Koonin, EV

Citation

Y.I. Wolf et al., Distribution of protein folds in the three superkingdoms of life, GENOME RES, 9(1), 1999, pp. 17-26

Citations number

Subject categories

Molecular Biology & Genetics

Journal title

GENOME RESEARCH

ISSN journal

1088-9051

Volume

9

Issue

1

Year of publication

1999

Pages

17 - 26

Database

ISI

SICI code

1054-9803(199901)9:1<17:DOPFIT>2.0.ZU;2-8

Abstract

A sensitive protein-fold recognition procedure was developed on the basis of iterative database search using the PSI-BLAST program. A collection of 1193 position-dependent weight matrices that can be used as fold identifiers was produced. In the completely sequenced genomes, folds could be automatically identified for 20%-30% of the proteins, with 3%-6% more detectable by additional analysis of conserved motifs. The distribution of the most common folds is very similar in bacteria and archaea but distinct in eukaryotes. Within the bacteria, this distribution differs between parasitic and free-living species. In all analyzed genomes, the P-loop NTPases are the most abundant fold. In bacteria and archaea, the next most common folds are ferredoxin-like domains, TIM-barrels, and methyltransferases, whereas in eukaryotes, the second to fourth places belong to protein kinases, beta-propellers and TIM-barrels. The observed diversity of protein folds in different proteomes is approximately twice as high as it would be expected from a simple stochastic model describing a proteome as a finite sample from an infinite pool of proteins with an exponential distribution of the fold fractions. Distribution of the number of domains with different folds in one protein fits the geometric model, which is compatible with the evolution of multidomain proteins by random combination of domains.