G. Lugosi and A. Nobel, CONSISTENCY OF DATA-DRIVEN HISTOGRAM METHODS FOR DENSITY ESTIMATION AND CLASSIFICATION, Annals of Statistics, 24(2), 1996, pp. 687-706
We present general sufficient conditions for the almost sure L1-consistency of histogram density estimates based on data-dependent partitions. Analogous conditions guarantee the almost sure risk consistency of histogram classification schemes based on data-dependent partitions. Multivariate data are considered throughout. In each case, the desired consistency requires shrinking cells, subexponential growth of a combinatorial complexity measure and sublinear growth of the number of cells. It is not required that the cells of every partition be rectangles with sides parallel to the coordinate axes or that each cell contain a minimum number of points. No assumptions are made concerning the common distribution of the training vectors. We apply the results to establish the consistency of several known partitioning estimates, including the k_n-spacing density estimate, classifiers based on statistically equivalent blocks and classifiers based on multivariate clustering schemes.
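
As an illustration of a data-dependent partition, the following is a minimal univariate sketch in the spirit of the k_n-spacing density estimate mentioned above. It is not taken from the paper: the function names `kn_spacing_histogram` and `evaluate` are hypothetical, and the sketch assumes the partition is restricted to the sample range (the estimate is set to zero outside it), with cell boundaries placed at every k-th order statistic.

```python
from bisect import bisect_left, bisect_right


def kn_spacing_histogram(sample, k):
    """Return (edges, heights) of a histogram whose cells hold about k points each.

    Simplifying assumption: cells cover only [min(sample), max(sample)].
    """
    xs = sorted(sample)
    n = len(xs)
    if k < 1 or n < 2 or xs[0] == xs[-1]:
        raise ValueError("need k >= 1 and at least two distinct sample points")
    # Cell boundaries: every k-th order statistic, plus the sample maximum.
    # set() drops duplicate boundaries caused by ties, so no cell has zero width.
    edges = sorted(set(xs[::k]) | {xs[-1]})
    # Index of the first sample point >= each boundary.
    starts = [bisect_left(xs, e) for e in edges]
    starts[-1] = n  # the last cell is closed on the right and includes the maximum
    # Histogram height on a cell: (points in cell) / (n * cell length).
    heights = [
        (starts[j + 1] - starts[j]) / (n * (edges[j + 1] - edges[j]))
        for j in range(len(edges) - 1)
    ]
    return edges, heights


def evaluate(edges, heights, x):
    """Evaluate the piecewise-constant estimate at x (zero outside the sample range)."""
    if x < edges[0] or x > edges[-1]:
        return 0.0
    j = min(bisect_right(edges, x) - 1, len(heights) - 1)
    return heights[j]
```

For example, `kn_spacing_histogram(data, k)` with `k = round(len(data) ** 0.5)` gives one choice in which k grows with the sample size while k/n shrinks, which is the kind of regime the shrinking-cell and sublinear-cell-count conditions point to; the specific square-root rule is an assumption made here for illustration only.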