Space-efficient online computation of quantile summaries

Citation
M. Greenwald et S. Khanna, Space-efficient online computation of quantile summaries, SIG RECORD, 30(2), 2001, pp. 58-66
Citations number
13
Categorie Soggetti
Computer Science & Engineering
Journal title
SIGMOD RECORD
ISSN journal
01635808 → ACNP
Volume
30
Issue
2
Year of publication
2001
Pages
58 - 66
Database
ISI
SICI code
0163-5808(200106)30:2<58:SOCOQS>2.0.ZU;2-J
Abstract
An E-approximate quantile summary of a sequence of N elements is a data str ucture that can answer quantile queries about the sequence to within a prec ision of epsilonN. We present a new online algorithm for computing E-approximate quantile summ aries of very large data sequences. The algorithm has a worst-case space re quirement of O(1/epsilon log(epsilonN)). This improves upon the previous be st result of O(1/epsilon log(2) (epsilonN)). Moreover, in contrast to earli er deterministic algorithms, our algorithm does not require a priori knowle dge of the length of the input sequence. Finally, the actual space bounds obtained on experimental data are signific antly better than the worst case guarantees of our algorithm as well as the observed space requirements of earlier algorithms.