From complete genomes to measures of substitution rate variability within and between proteins

Citation
N.V. Grishin et al., From complete genomes to measures of substitution rate variability within and between proteins, Genome Res., 10(7), 2000, pp. 991-1000
Number of citations
44
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
Journal ISSN
1088-9051
Volume
10
Issue
7
Year of publication
2000
Pages
991 - 1000
Database
ISI
SICI code
1088-9051(200007)10:7<991:FCGTMO>2.0.ZU;2-4
Abstract
Accumulation of complete genome sequences of diverse organisms creates new possibilities for evolutionary inferences from whole-genome comparisons. In the present study, we analyze the distributions of substitution rates among proteins encoded in 19 complete genomes (the interprotein rate distribution). To estimate these rates, it is necessary to employ another fundamental distribution, that of the substitution rates among sites in proteins (the intraprotein distribution). Using two independent approaches, we show that intraprotein substitution rate variability appears to be significantly greater than generally accepted. This yields more realistic estimates of evolutionary distances from amino-acid sequences, which is critical for evolutionary-tree construction. We demonstrate that the interprotein rate distributions inferred from the genome-to-genome comparisons are similar to each other and can be approximated by a single distribution with a long exponential shoulder. This suggests that a generalized version of the molecular clock hypothesis may be valid on genome scale. We also use the scaling parameter of the obtained interprotein rate distribution to construct a rooted whole-genome phylogeny. The topology of the resulting tree is largely compatible with those of global rRNA-based trees and trees produced by other approaches to genome-wide comparison.