One-pass algorithms for sampling n records without replacement from a
population of unknown size N are known as reservoir-sampling algorithm
s. In this article, Vitter's reservoir-sampling algorithm, algorithm Z
, is modified to give a more efficient algorithm, algorithm K. Additio
nally, two new algorithms, algorithm L and algorithm M, are proposed.
If the time for scanning the population is ignored, all the four algor
ithms have expected CPU time O(n(1 + log(N/n))), which is optimum up t
o a constant factor. Expressions of the expected CPU time for the algo
rithms are presented. Among the four, algorithm L is the simplest, and
algorithm M is the most efficient when n and N/n are large and N is O
(n(2)).