Cost models based on the clustering factor (CF) of the attributes have
been proposed and shown to be attractive for block access estimation
in databases, thanks to their accuracy and economy of use. While query
optimizers can use the actual CFs, measured from the data, physical
design methods and tools must rely on estimates before the data are
stored. In this paper we present a CF estimation procedure which can be
applied to totally clustered attributes (e.g. ordered attributes).
Simple and accurate approximations of the derived formulas are also
introduced. Simulations show the accuracy of the proposed CF estimates
and the improvement in their behaviour compared to previously published
estimates. Reliability for physical design of cost models based on the CF
in the presence of a skewed data distribution is also discussed.