A memory-based learning system is an extended memory management system
that decomposes the input space either statistically or dynamically i
nto subregions for the purpose of storing and retrieving functional in
formation. The main generalization techniques employed by memory-based
learning systems are the nearest-neighbor search, space decomposition
techniques, and clustering. Research on memory-based learning is stil
l in its early stage. In particular, there are very few rigorous theor
etical results regarding memory requirement, sample size, expected per
formance, and computational complexity. In this paper, we propose a mo
del for memory-based learning and use it to analyze several methods-is
-an-element-of-covering, hashing, clustering, tree-structured clusteri
ng, and receptive-fields-for learning smooth functions. The sample siz
e and system complexity are derived for each method. Our model is buil
t upon the generalized PAC learning model of Haussler (Haussler, 1989)
and is closely related to the method of vector quantization in data c
ompression. Our main result is that we can build memory-based learning
systems using new clustering algorithms (Lin & Vitter, 1992a) to PAC-
learn in polynomial time using only polynomial storage in typical situ
ations.