An E-approximate quantile summary of a sequence of N elements is a data str
ucture that can answer quantile queries about the sequence to within a prec
ision of epsilonN.
We present a new online algorithm for computing E-approximate quantile summ
aries of very large data sequences. The algorithm has a worst-case space re
quirement of O(1/epsilon log(epsilonN)). This improves upon the previous be
st result of O(1/epsilon log(2) (epsilonN)). Moreover, in contrast to earli
er deterministic algorithms, our algorithm does not require a priori knowle
dge of the length of the input sequence.
Finally, the actual space bounds obtained on experimental data are signific
antly better than the worst case guarantees of our algorithm as well as the
observed space requirements of earlier algorithms.