ITA
ENG

RESERVOIR-SAMPLING ALGORITHMS OF TIME-COMPLEXITY O(N(1+LOG(N N)))/

Authors

LI KH

Citation

Kh. Li, RESERVOIR-SAMPLING ALGORITHMS OF TIME-COMPLEXITY O(N(1+LOG(N N)))/, ACM transactions on mathematical software, 20(4), 1994, pp. 481-493

Citations number

Categorie Soggetti

Computer Sciences",Mathematics

Journal title

ACM transactions on mathematical software → ACNP

ISSN journal

00983500

Volume

Issue

Year of publication

1994

Pages

481 - 493

Database

ISI

SICI code

0098-3500(1994)20:4<481:RAOTON>2.0.ZU;2-2

Abstract

One-pass algorithms for sampling n records without replacement from a population of unknown size N are known as reservoir-sampling algorithm s. In this article, Vitter's reservoir-sampling algorithm, algorithm Z , is modified to give a more efficient algorithm, algorithm K. Additio nally, two new algorithms, algorithm L and algorithm M, are proposed. If the time for scanning the population is ignored, all the four algor ithms have expected CPU time O(n(1 + log(N/n))), which is optimum up t o a constant factor. Expressions of the expected CPU time for the algo rithms are presented. Among the four, algorithm L is the simplest, and algorithm M is the most efficient when n and N/n are large and N is O (n(2)).