When a favourable mutation sweeps to fixation, those genes initially linked
to it increase in frequency; on average, this reduces diversity in the
surrounding region of the genome. In the first analysis of this 'hitch-hiking'
effect, Maynard Smith and Haigh (1974) followed the increase of the neutral
allele that chanced to be associated with the new mutation in the first
generation, and assumed that the subsequent increase was deterministic.
Later analyses, based either on coalescence arguments or on diffusion
equations for the mean and variance of allele frequency, have also made one or both
of these assumptions. In the early generations, stochastic fluctuations in
the frequency of the selected allele, and coalescence of neutral lineages,
can be accounted for correctly by following relationships between genes
conditional on the number of copies of the favourable allele. This analysis
shows that the hitch-hiking effect is increased because an allele that is
destined to fix tends to increase more rapidly than exponentially. However,
the identity generated by the selective sweep has the same form as in
previous work, $h[r/s]\,(2Ns)^{-2r/s}$, where $h[r/s]$ tends to 1 with tight
linkage. This analysis is extended to samples of many genes; then, genes may
trace back to several families of lineages, each related through a common ancestor
early in the selective sweep. Simulations show that the number and sizes of
these families can (in principle) be used to make separate estimates of r/s
and Ns.
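
To give a feel for the scale of the identity quoted above, the following minimal sketch evaluates h (2Ns)^(-2r/s) for a few values of r/s. It is an illustration only, not the analysis of the paper. The function name `sweep_identity` is hypothetical; r, s and N are read here as recombination rate, selective advantage and population size, which the abstract does not define explicitly; and h is simply set to 1, its tight-linkage limit, because the abstract does not give its functional form.

```python
def sweep_identity(r, s, N, h=1.0):
    """Evaluate h * (2*N*s)**(-2*r/s), the form quoted in the text.

    Assumptions (not taken from the source): r is the recombination
    rate, s the selective advantage, N the population size, and h is
    fixed at 1, its limit under tight linkage.
    """
    if r < 0 or s <= 0 or N <= 0:
        raise ValueError("need r >= 0, s > 0 and N > 0")
    if 2.0 * N * s <= 1.0:
        # The expression is only a sensible probability when 2Ns > 1.
        raise ValueError("need 2*N*s > 1")
    return h * (2.0 * N * s) ** (-2.0 * r / s)


if __name__ == "__main__":
    N, s = 10_000, 0.01  # illustrative values, giving 2Ns = 200
    for ratio in (0.0, 0.01, 0.1, 0.5):
        r = ratio * s
        print(f"r/s = {ratio:<5} identity = {sweep_identity(r, s, N):.4f}")
```

With h held at 1, the identity equals 1 for complete linkage and falls as r/s grows, and for a given r/s it falls as Ns grows. Because the expression depends on r/s and on Ns in different ways, the two quantities are in principle separable, although the abstract bases its estimates on the number and sizes of lineage families rather than on this expression alone.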