R. Deka et al., Rate and directionality of mutations and effects of allele size constraints at anonymous, gene-associated, and disease-causing trinucleotide loci, MOL BIOL EV, 16(9), 1999, pp. 1166-1177
We studied the patterns of within- and between-population variation at 29
trinucleotide loci in a random sample of 200 healthy individuals from four
diverse populations: Germans, Nigerians, Chinese, and New Guinea
highlanders. The loci were grouped as disease-causing (seven loci with CAG
repeats), gene-associated (seven loci with CAG/CCG repeats and eight loci
with AAT repeats), or anonymous (seven loci with AAT repeats). We used
heterozygosity and variance of allele size (expressed in units of repeat
counts) as measures of within-population variability and G(ST) (based on
heterozygosity as well as on allele size variance) as the measure of
genetic differentiation between populations. Our observations are: (1)
locus type is the major significant factor for differences in
within-population genetic variability; (2) the disease-causing CAG repeats
(in the nondisease range of repeat counts) have the highest
within-population variation, followed by the AAT-repeat anonymous loci, the
AAT-repeat gene-associated loci, and the CAG/CCG-repeat gene-associated
loci; (3) an imbalance index beta, the ratio of the estimates of the
product of effective population size and mutation rate based on allele size
variance and heterozygosity, is the largest for disease-causing loci,
followed by AAT- and CAG/CCG-repeat gene-associated loci and AAT-repeat
anonymous loci; (4) mean allele size correlates positively with allele size
variance for AAT- and CAG/CCG-repeat gene-associated loci and negatively
for anonymous loci; and (5) G(ST) is highest for the disease-causing loci.
These observations are explained by specific differences of rates and
patterns of mutations in these four groups of trinucleotide loci, taking
into consideration the effects of the past demographic history of the
modern human population.
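
The summary statistics named in the abstract (heterozygosity, allele size
variance, the imbalance index beta, and G(ST)) can be illustrated with a
minimal Python sketch. The abstract does not give the estimator formulas,
so the ones below are assumptions: they are the standard forms under a
single-step stepwise mutation model with theta = 4 * N * mu, and G(ST) is
computed from heterozygosity only, assuming equal weight per population.
All function names are hypothetical and chosen for illustration.

```python
from collections import Counter
from statistics import variance


def allele_frequencies(alleles):
    """Relative frequency of each allele size (repeat count) in a sample."""
    n = len(alleles)
    return {a: c / n for a, c in Counter(alleles).items()}


def heterozygosity(alleles):
    """Unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(alleles)
    freqs = allele_frequencies(alleles)
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs.values()))


def theta_from_variance(alleles):
    """Assumed estimator: under single-step mutation, E[variance] = theta / 2."""
    return 2.0 * variance(alleles)


def theta_from_heterozygosity(alleles):
    """Assumed estimator: expected homozygosity is 1 / sqrt(1 + 2 * theta)."""
    homozygosity = 1.0 - heterozygosity(alleles)
    if homozygosity <= 0.0:
        raise ValueError("homozygosity is zero; theta is not estimable")
    return (1.0 / homozygosity ** 2 - 1.0) / 2.0


def imbalance_index(alleles):
    """beta = variance-based theta divided by heterozygosity-based theta."""
    return theta_from_variance(alleles) / theta_from_heterozygosity(alleles)


def g_st(populations):
    """G(ST) = (H_T - H_S) / H_T from allele frequencies, equal population weights."""
    freq_tables = [allele_frequencies(pop) for pop in populations]
    # H_S: mean within-population expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in freq_tables)
    h_s /= len(freq_tables)
    # H_T: expected heterozygosity of the averaged allele frequencies.
    all_alleles = set().union(*freq_tables)
    mean_freqs = [
        sum(f.get(a, 0.0) for f in freq_tables) / len(freq_tables)
        for a in all_alleles
    ]
    h_t = 1.0 - sum(p * p for p in mean_freqs)
    return (h_t - h_s) / h_t
```

For example, imbalance_index([10, 10, 11, 12]) takes one population's
allele sizes at a single locus, and g_st([[10, 10], [11, 11]]) takes one
list of allele sizes per population. The variance-based form of G(ST)
mentioned in the abstract is not covered by this sketch.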