The ancestry of a sample of sequences subject to recombination

Authors
Citation
C. Wiuf and J. Hein, The ancestry of a sample of sequences subject to recombination, GENETICS, 151(3), 1999, pp. 1217-1228
Number of citations
13
Subject categories
Biology; Molecular Biology & Genetics
Journal title
GENETICS
ISSN journal
0016-6731
Volume
151
Issue
3
Year of publication
1999
Pages
1217 - 1228
Database
ISI
SICI code
0016-6731(199903)151:3<1217:TAOASO>2.0.ZU;2-J
Abstract
In this article we discuss the ancestry of sequences sampled from the coalescent with recombination with constant population size 2N. We have studied a number of variables based on simulations of sample histories, and some analytical results are derived. Consider the leftmost nucleotide in the sequences. We show that the number of nucleotides sharing a most recent common ancestor (MRCA) with the leftmost nucleotide is approximately log(1 + 4NLr)/(4Nr) when two sequences are compared, where L denotes sequence length in nucleotides and r the recombination rate between any two neighboring nucleotides per generation. For larger samples, the number of nucleotides sharing an MRCA with the leftmost nucleotide decreases and becomes almost independent of 4NLr. Further, we show that a segment of the sequences sharing an MRCA consists on average of 3/(8Nr) nucleotides when two sequences are compared, and that this decreases toward 1/(4Nr) nucleotides when the whole population is sampled. A measure of the correlation between the genealogies of two nucleotides on two sequences is introduced. We show analytically that even when the nucleotides are separated by a large genetic distance, but share an MRCA, the genealogies will show only little correlation. This is surprising, because the time until the two nucleotides shared an MRCA is reciprocal to the genetic distance. Simulations show that the mean time until all positions in the sample have found an MRCA increases logarithmically with increasing sequence length and is considerably lower than a theoretically predicted upper bound. On the basis of simulations, it turns out that important properties of the coalescent with recombination of the whole population are reflected in the properties of a small sample.
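The three closed-form approximations quoted in the abstract can be evaluated numerically, as in the minimal sketch below. The function names and the example values of N, r and L are illustrative assumptions, not taken from the paper, and the logarithm is assumed to be the natural logarithm.

```python
import math


def shared_with_leftmost_pair(N, r, L):
    """Approximate number of nucleotides sharing an MRCA with the leftmost
    nucleotide when two sequences are compared: log(1 + 4NLr) / (4Nr).
    Assumes the natural logarithm."""
    rho = 4.0 * N * r  # scaled recombination rate between neighboring nucleotides
    return math.log1p(rho * L) / rho


def mean_segment_length_pair(N, r):
    """Mean length (in nucleotides) of a segment sharing an MRCA
    when two sequences are compared: 3 / (8Nr)."""
    return 3.0 / (8.0 * N * r)


def mean_segment_length_population(N, r):
    """Limiting mean segment length when the whole population
    is sampled: 1 / (4Nr)."""
    return 1.0 / (4.0 * N * r)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    N, r, L = 10_000, 1e-8, 100_000
    print(shared_with_leftmost_pair(N, r, L))
    print(mean_segment_length_pair(N, r))
    print(mean_segment_length_population(N, r))
```

For very short sequences (4NLr much smaller than 1) the first expression is close to L itself, which is a quick sanity check on the formula: with almost no recombination, essentially every nucleotide shares its MRCA with the leftmost one.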