J. Wakeley et al., The discovery of single-nucleotide polymorphisms - and inferences about human demographic history, AM J HU GEN, 69(6), 2001, pp. 1332-1347
Number of citations
42
Subject Categories
Research/Laboratory Medicine & Medical Technology; Molecular Biology & Genetics
A method of historical inference that accounts for ascertainment bias is
developed and applied to single-nucleotide polymorphism (SNP) data in
humans. The data consist of 84 short fragments of the genome that were
selected, from three recent SNP surveys, to contain at least two
polymorphisms in their respective ascertainment samples and that were then
fully resequenced in 47 globally distributed individuals. Ascertainment
bias is the deviation, from what would be observed in a random sample,
caused either by discovery of polymorphisms in small samples or by locus
selection based on levels or patterns of polymorphism. The three SNP
surveys from which the present data were derived differ both in their
protocols for ascertainment and in the size of the samples used for
discovery. We implemented a Monte Carlo maximum-likelihood method to fit a
subdivided-population model that includes a possible change in effective
size at some time in the past. Incorrectly assuming that ascertainment bias
does not exist causes errors in inference, affecting estimates of both
migration rates and historical changes in size. Migration rates are
overestimated when ascertainment bias is ignored. However, the direction of
error in inferences about changes in effective population size (whether the
population is inferred to be shrinking or growing) depends on whether the
numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We
use the abbreviation "SDL," for "SNP-discovered locus," in recognition of
the genomic-discovery context of SNPs. When ascertainment bias is modeled
fully, both the number of SNPs per SDL and their allele frequencies support
a scenario of growth in effective size in the context of a subdivided
population. If subdivision is ignored, however, the hypothesis of constant
effective population size cannot be rejected. An important conclusion of
this work is that, in demographic or other studies, SNP data are useful
only to the extent that their ascertainment can be modeled.
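
The effect of discovering polymorphisms in a small sample can be
illustrated with a short calculation. The sketch below is not the Monte
Carlo maximum-likelihood method of the paper; it is a toy example under
assumptions that are not taken from the abstract: a single unstructured
population of constant size (so the expected frequency spectrum is
proportional to 1/i), a discovery panel drawn at random from the same
chromosomes that are later resequenced, and hypothetical sample sizes
(n = 40 chromosomes, discovery panel of d = 4). All function names are
illustrative.

```python
from math import comb


def neutral_sfs(n):
    """Expected spectrum for n chromosomes under the assumed standard
    neutral model: P(i derived copies) is proportional to 1/i, i = 1..n-1."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def ascertainment_prob(i, n, d):
    """Probability that a SNP with i derived copies among n chromosomes is
    polymorphic in a random discovery subsample of d of those chromosomes."""
    all_ancestral = comb(n - i, d) / comb(n, d)  # panel misses the derived allele
    all_derived = comb(i, d) / comb(n, d)        # panel misses the ancestral allele
    return 1.0 - all_ancestral - all_derived


def ascertained_sfs(n, d):
    """Spectrum of the SNPs that survive discovery in a panel of size d."""
    weighted = [
        p * ascertainment_prob(i, n, d)
        for i, p in enumerate(neutral_sfs(n), start=1)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


if __name__ == "__main__":
    n, d = 40, 4  # hypothetical sample sizes, chosen only for illustration
    unbiased = neutral_sfs(n)
    biased = ascertained_sfs(n, d)
    print("singleton fraction, random sample:   %.3f" % unbiased[0])
    print("singleton fraction, ascertained SNPs: %.3f" % biased[0])
```

Under these assumptions, rare variants are underrepresented among the
ascertained SNPs, because a small panel is unlikely to contain both
alleles of a rare variant. An analysis of allele frequencies that ignores
the discovery step therefore sees too few rare alleles.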