Window training, based on an extended form of stochastic approximation, offers a means of producing linear classifiers that minimize the probability of misclassification of statistically generated data. Associated with window training is a window criterion function. We show that minimizing the window criterion function yields a linear classifier that minimizes the probability of misclassification (i.e., the "error rate"). However, window training may produce a local minimum that exceeds the global minimum error rate.
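As a concrete, entirely illustrative sketch of the idea, the routine below estimates a linear classifier by a stochastic-approximation rule that corrects the weight vector only on samples falling within a shrinking window around the current decision surface. The function name window_train, the gain schedule a0/k, and the window schedule h0/sqrt(k) are our assumptions, not the paper's specification. Because the underlying window criterion is non-convex, such updates can settle into a local minimum whose error rate exceeds the global one.

```python
import numpy as np

def window_train(X, y, w0, steps=10000, a0=1.0, h0=1.0, seed=None):
    """Window-training sketch: a perceptron-like correction applied only
    to samples inside a window of half-width h_k around the decision
    surface, with gain a_k and window h_k shrinking as stochastic
    approximation requires.  Labels y are in {-1, +1}; a constant 1
    feature appended to X absorbs the bias term.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for k in range(1, steps + 1):
        i = rng.integers(len(X))
        a_k = a0 / k                  # decreasing gain
        h_k = h0 / np.sqrt(k)         # shrinking window
        if abs(w @ X[i]) <= h_k:      # sample lies inside the window
            w += a_k * y[i] * X[i]    # correct toward the sample's class
    return w

def error_rate(X, y, w):
    """Empirical misclassification rate of the classifier sign(w . x)."""
    return float(np.mean(np.sign(X @ w) != y))
```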
We show that this defect does not occur in the error-correcting perceptron: the criterion minimized by that training procedure is "convex," i.e., the perceptron criterion has only one local minimum. Consequently, we recommend that window training be preceded by perceptron training; perceptron training produces a decision surface that the window training process then moves to a position likely to be globally optimum.
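The error-correcting perceptron rule itself is standard, so the sketch below is close to textbook form (again assuming {-1, +1} labels and an augmented feature vector that absorbs the bias). It descends the convex, piecewise-linear perceptron criterion J(w) = -sum over misclassified i of y_i (w . x_i), and its output serves as the starting point handed to window training.

```python
import numpy as np

def perceptron_train(X, y, epochs=100):
    """Error-correcting perceptron: update only on misclassified samples.
    The criterion it descends is convex, so there is a single local
    (hence global) minimum.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        changed = False
        for x, yi in zip(X, y):
            if yi * (w @ x) <= 0:    # misclassified or on the boundary
                w += yi * x          # error-correcting update
                changed = True
        if not changed:              # no errors left: data separated
            break
    return w

# Recommended pipeline: let the convex perceptron stage pick the starting
# point, then let window training refine it toward the minimum error rate.
# w0 = perceptron_train(X, y)
# w  = window_train(X, y, w0)   # window_train as sketched above
```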
When a sufficiently large set of exemplars of the data is available at the beginning of the training process, the basis exchange algorithm offers a computationally convenient alternative to the window training algorithm for achieving a locally minimum error rate. The basis exchange algorithm finds the local minimum of a window criterion in a finite number of steps (approximately 3d steps, where d is the dimensionality of feature space). Window training, on the other hand, may require an indefinitely long time to converge to a locally minimum error rate.
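The paper's basis exchange algorithm is not reproduced here, so the following is only a schematic analogue of the idea under our own assumptions: treat d exemplars as a basis that pins the decision hyperplane, and at each step exchange a single basis point for another exemplar whenever the swap strictly lowers the empirical error count. Because every accepted exchange reduces a bounded integer, the search necessarily terminates after finitely many exchanges; nothing in the sketch (the names, the greedy acceptance rule, the step count) should be read as the authors' procedure.

```python
import numpy as np

def hyperplane_through(points):
    """(w, b) of a hyperplane through d points in R^d, computed as the
    null vector of the d x (d+1) system [x_i, 1] . (w, b) = 0."""
    A = np.hstack([points, np.ones((len(points), 1))])
    v = np.linalg.svd(A)[2][-1]        # direction of smallest singular value
    return v[:-1], v[-1]

def error_count(X, y, w, b):
    """Misclassifications under the better of the two orientations of (w, b)."""
    pred = np.sign(X @ w + b)
    return int(min(np.sum(pred != y), np.sum(-pred != y)))

def basis_exchange_sketch(X, y, seed=None):
    d = X.shape[1]
    rng = np.random.default_rng(seed)
    basis = list(rng.choice(len(X), size=d, replace=False))
    w, b = hyperplane_through(X[basis])
    best = error_count(X, y, w, b)
    improved = True
    while improved:                     # each pass exchanges at most one point
        improved = False
        for slot in range(d):
            for j in range(len(X)):
                if j in basis:
                    continue
                trial = basis.copy()
                trial[slot] = j
                tw, tb = hyperplane_through(X[trial])
                e = error_count(X, y, tw, tb)
                if e < best:            # accept only strictly improving swaps
                    basis, best, w, b = trial, e, tw, tb
                    improved = True
    if np.sum(np.sign(X @ w + b) != y) != best:
        w, b = -w, -b                   # orient (w, b) to realize `best`
    return w, b, best
```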