We compare the performance of sampling-based procedures for estimating
the selectivity of a join. While some of the procedures have been pro
posed in the database literature, their relative performance has never
been analyzed. A main result of this paper is a partial ordering that
compares the variability of the estimators for the different procedur
es after an arbitrary fixed number of sampling steps. Prior to the cur
rent work, it was also unknown whether these fixed-step procedures cou
ld be extended to fixed-precision procedures that are both asymptotica
lly consistent and asymptotically efficient.: Our second main result i
s a general method for such an extension and a proof that the method i
s valid for all the procedures under consideration. We show that, unde
r plausible assumptions on sampling costs, the partial ordering of the
fixed-step procedures with respect to variability of the selectivity
estimator implies a partial ordering of the corresponding fixed-precis
ion procedures with respect to sampling cost. Our final result is a co
llection of fixed-step and fixed-precision procedures for estimating t
he cost of processing a join query according to a fixed join plan. (C)
1996 Academic Press, Inc.