The number of adjustments required to learn the average LTU function of d features, each of which can take on n equally spaced values, grows as approximately n^(2d) when the standard perceptron training algorithm is used on the complete input space of n^d points and perfect classification is required.
We demonstrate a simple modification that reduces the observed growth rate in the number of adjustments to approximately d^2(log(d) + log(n)) with most, but not all, input presentation orders. A similar speed-up is also produced by applying the simple but computationally expensive heuristic "don't overgeneralize" to the standard training algorithm. This performance is very close to the theoretical optimum for learning LTU functions by any method, and it is evidence that perceptron-like learning algorithms can learn arbitrary LTU functions in polynomial, rather than exponential, time under normal training conditions. Similar modifications can be applied to the Winnow algorithm, achieving similar performance improvements and demonstrating the generality of the approach.
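For reference, the sketch below illustrates the two baseline algorithms named above, the standard perceptron rule and Winnow, together with the adjustment count that the reported growth rates measure. It does not implement the modification or the "don't overgeneralize" heuristic, which are not described here; the function names, epoch cap, and Winnow parameters (threshold d/2, promotion factor 2) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def perceptron_train(X, y, max_epochs=1000):
    """Standard perceptron training: cycle through the examples and adjust the
    weight vector on every misclassification.  Labels y are in {-1, +1}.
    Returns the learned weights, bias, and the number of adjustments made."""
    _, d = X.shape
    w = np.zeros(d)
    b = 0.0
    adjustments = 0
    for _ in range(max_epochs):
        converged = True
        for x, target in zip(X, y):
            if target * (w @ x + b) <= 0:   # misclassified (or on the boundary)
                w += target * x             # additive correction
                b += target
                adjustments += 1
                converged = False
        if converged:
            break
    return w, b, adjustments

def winnow_train(X, y, max_epochs=1000):
    """Winnow-style training: multiplicative weight updates against a fixed
    threshold.  Assumes non-negative feature values and labels y in {0, 1}."""
    _, d = X.shape
    w = np.ones(d)
    theta = d / 2.0                         # conventional threshold (assumed)
    adjustments = 0
    for _ in range(max_epochs):
        converged = True
        for x, target in zip(X, y):
            pred = 1 if w @ x >= theta else 0
            if pred != target:
                # promote active weights on false negatives, demote on false positives
                w *= np.where(x > 0, 2.0 if target == 1 else 0.5, 1.0)
                adjustments += 1
                converged = False
        if converged:
            break
    return w, adjustments
```

In this framing, the quantity tracked by the abstract is the returned adjustment count when X enumerates the complete input space of n^d points and training is run to perfect classification.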