THE MEAN AND VARIANCE OF THE NUMBER OF SEGREGATING SITES SINCE THE LAST HITCHHIKING EVENT

Citation
M. Perlitz and W. Stephan, THE MEAN AND VARIANCE OF THE NUMBER OF SEGREGATING SITES SINCE THE LAST HITCHHIKING EVENT, Journal of Mathematical Biology, 36(1), 1997, pp. 1-23
Number of citations
30
Subject categories
Mathematical Methods, Biology & Medicine; Biology Miscellaneous; Mathematics, Miscellaneous
ISSN journal
0303-6812
Volume
36
Issue
1
Year of publication
1997
Pages
1 - 23
Database
ISI
SICI code
0303-6812(1997)36:1<1:TMAVOT>2.0.ZU;2-2
Abstract
Tight linkage may cause a reduction of nucleotide diversity in a chromosomal region if an advantageous mutation appears in that region and is driven to fixation by directional selection. This process is usually called genetic hitchhiking. If selection is strong, the entire process takes place during a time period of length (2/s) ln(2N) generations, which is very short relative to 2N generations [s is the selection coefficient of the advantageous mutation and N the effective diploid population size]. On the time scale of 2N generations, which is characteristic for neutral evolution, we may therefore call this process a hitchhiking event. Using coalescent methods, we analyzed a model in which a hitchhiking event occurred in a chromosomal region of zero recombination at time x in the past. Such a hitchhiking "catastrophe" completely wipes out the genetic variation that existed in the population before that time. Standing variation observed at present must therefore be due to mutations that have arisen since time point x. Assuming that all newly arising mutations are neutral, we derived expressions for the expectation, the variance and the higher moments of the number of nucleotide sites segregating in a sample of n genes as a function of x. The result for the first moment is then used to estimate the time back to the last hitchhiking event based on DNA polymorphism data from Drosophila. Assuming that directional selection is the sole determinant of the level of genetic variation in the gene regions surveyed, we obtained estimates of x that were typically on the order of 0.1N generations.
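
The two quantitative ideas in the abstract can be illustrated with a short Python sketch. It is not the authors' analytical derivation: the function names are hypothetical, and the second part is a plain Monte Carlo approximation resting on assumptions the abstract does not state, namely that x is measured in units of 2N generations, that k lineages coalesce at rate k(k-1)/2 on that scale, that all lineages still present at time x merge instantly at the hitchhiking event, and that neutral mutations arise at rate theta/2 per lineage per unit time with theta = 4N*mu.

```python
import math
import random


def sweep_duration(s, N):
    """Approximate duration of a strong sweep in generations: (2/s) ln(2N)."""
    return (2.0 / s) * math.log(2 * N)


def truncated_tree_length(n, x, rng):
    """Total branch length of a neutral coalescent tree for n genes,
    cut off at time x (assumed units: 2N generations), where the
    hitchhiking event is assumed to merge all remaining lineages."""
    t = 0.0
    length = 0.0
    k = n
    while k > 1:
        # Waiting time until the next coalescence among k lineages.
        wait = rng.expovariate(k * (k - 1) / 2.0)
        if t + wait >= x:
            # The hitchhiking event is reached first: the k lineages
            # each contribute the remaining time up to x.
            return length + k * (x - t)
        length += k * wait
        t += wait
        k -= 1
    # The sample found its common ancestor more recently than x.
    return length


def expected_segregating_sites(n, x, theta, reps=20000, seed=1):
    """Monte Carlo estimate of the mean number of segregating sites,
    (theta/2) * E[tree length], under the assumptions stated above."""
    rng = random.Random(seed)
    mean_length = sum(truncated_tree_length(n, x, rng) for _ in range(reps)) / reps
    return 0.5 * theta * mean_length
```

For instance, sweep_duration(0.01, 1e6) gives roughly 2,900 generations, compared with 2N = 2,000,000 generations for the neutral time scale. For very large x the simulated mean approaches the standard neutral expectation theta * (1 + 1/2 + ... + 1/(n-1)), and for small x it falls well below that value, which is the signal the abstract describes using to estimate x.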