The frequency and distribution of genetic polymorphism in the human genome
is a question of major importance. We have studied this in highly conserved
genes, which encode crucial functions such as DNA replication, mRNA
transcription, and translation. Evolutionary comparisons suggest that these genes
are under particularly strong selective pressure, and their frequency of
nucleotide sequence polymorphism would be expected to represent a minimum
estimate for sequence variation throughout the genome. We have analyzed the
complete coding sequence and the 3'-untranslated region (3'-UTR) of 22 human
genes, most of which have homologs in all cellular organisms and all of
which are at least 25% identical in amino acid sequence to homologs in yeast. Comparisons
with similar studies of less conserved human disease genes indicate that
1) evolutionarily conserved genes are, on average, less polymorphic than
disease-related genes; 2) the difference in polymorphism levels is
attributable almost entirely to reduced levels of variation in protein-coding
sequences, whereas noncoding sequences have similar levels of polymorphism; and 3)
the character of polymorphism, in terms of the spectrum and frequency of
mutational changes, is similar in both groups of genes.