The discovery of single-nucleotide polymorphisms - and inferences about human demographic history

Citation
J. Wakeley et al., The discovery of single-nucleotide polymorphisms - and inferences about human demographic history, AM J HU GEN, 69(6), 2001, pp. 1332-1347
Number of citations
42
Subject categories
Research/Laboratory Medicine & Medical Technology; Molecular Biology & Genetics
Journal title
AMERICAN JOURNAL OF HUMAN GENETICS
ISSN journal
0002-9297
Volume
69
Issue
6
Year of publication
2001
Pages
1332 - 1347
Database
ISI
SICI code
0002-9297(200112)69:6<1332:TDOSP->2.0.ZU;2-Y
Abstract
A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting both estimates of migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether either the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation "SDL," for "SNP-discovered locus," in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected.
An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled.
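
Illustrative sketch (not part of the original record, and not the authors' Monte Carlo maximum-likelihood method): the abstract defines ascertainment bias partly as the distortion caused by discovering polymorphisms in small samples. The Python snippet below shows that one effect in the simplest setting we could choose. It assumes a single constant-size, unstructured neutral population, in which the expected share of sites with i derived copies among n chromosomes is proportional to 1/i, and it assumes the discovery sample is a random subset of m of those same n chromosomes. The function names, the use of 94 chromosomes (two per each of the 47 individuals), and the discovery size of 2 are all hypothetical choices for illustration.

```python
from math import comb


def neutral_sfs(n):
    """Expected site-frequency spectrum for n chromosomes under the
    standard neutral model (assumed here): P(i derived copies) ~ 1/i."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def discovery_prob(i, n, m):
    """Probability that a site with i derived copies among n chromosomes
    is polymorphic in a random subsample of m chromosomes."""
    all_ancestral = comb(n - i, m) / comb(n, m)
    all_derived = comb(i, m) / comb(n, m)
    return 1.0 - all_ancestral - all_derived


def ascertained_sfs(n, m):
    """Spectrum of the sites that survive discovery in a subsample of m."""
    base = neutral_sfs(n)
    weighted = [p * discovery_prob(i, n, m)
                for i, p in enumerate(base, start=1)]
    total = sum(weighted)
    return [w / total for w in weighted]


def mean_frequency(sfs, n):
    """Average derived-allele frequency implied by a spectrum."""
    return sum(p * i / n for i, p in enumerate(sfs, start=1))


if __name__ == "__main__":
    n, m = 94, 2  # hypothetical: 47 diploid individuals, discovery in 2 chromosomes
    print("random-sample mean frequency:", mean_frequency(neutral_sfs(n), n))
    print("ascertained mean frequency: ", mean_frequency(ascertained_sfs(n, m), n))
```

Under these assumptions, sites discovered in a small sample are enriched for intermediate-frequency alleles, so the ascertained spectrum has a higher mean allele frequency than a random sample would. An analysis that ignores the discovery step would read that shift as a demographic signal, which is the kind of error the abstract describes.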