Efficient on-line nonparametric kernel density estimation

Citation
Cg. Lambert et al., Efficient on-line nonparametric kernel density estimation, ALGORITHMIC, 25(1), 1999, pp. 37-57
Citations number
34
Categorie Soggetti
Engineering Mathematics
Journal title
ALGORITHMICA
ISSN journal
01784617 → ACNP
Volume
25
Issue
1
Year of publication
1999
Pages
37 - 57
Database
ISI
SICI code
0178-4617(199909)25:1<37:EONKDE>2.0.ZU;2-5
Abstract
Nonparametric density estimation has broad applications in computational fi nance especially in cases where high frequency data are available. However, the technique is often intractable, given the run times necessary to evalu ate a density. We present a new and efficient algorithm based on multipole techniques. Given the n kernels that estimate the density, current methods take O(n) time directly to sum the kernels to perform a single density quer y. In an on-line algorithm where points are continually added to the densit y, the cumulative O (n(2)) running time for n queries makes it very costly, if not impractical, to compute the density for large n. Our new Multipole- accelerated On-line Density Estimation (MODE) algorithm is general in that it can be applied to any kernel tin arbitrary dimensions) that admits a Tay lor series expansion. The running time for a density query reduces to O(log n) or even constant time, depending on the kernel chosen, and, hence, the cumulative running time is reduced to O (n log n) or O (n), respectively. O ur results show that the MODE algorithm provides dramatic advantages over t he direct approach to density evaluation. For example, we show using a mode st computing platform that on-line density updates and queries for 1 millio n points and two dimensions take 8 days to compute using the direct approac h versus 40 seconds with the MODE approach.