I. Derenyi et al., "Generalization in the programmed teaching of a perceptron," Physical Review E 50(4), 1994, pp. 3192-3200
According to a widely used model of learning and generalization in neural networks, a single neuron (perceptron) can learn from examples to imitate another neuron, called the teacher perceptron. We introduce a variant of this model in which examples within a layer of thickness 2Y around the decision surface are excluded from teaching. That restriction transmits global information about the teacher's rule. Therefore, for a given number p = alpha N of presented examples (i.e., those outside the layer), the generalization performance obtained by Boltzmannian learning is improved by setting Y to an optimum value Y_0(alpha), which diverges for alpha --> 0 and remains nonzero while alpha < alpha_c ≈ 5.7. That suggests programmed learning: easy examples should be taught first.
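The exclusion rule described in the abstract can be illustrated with a minimal sketch: a random teacher weight vector labels Gaussian inputs by the sign of its overlap with them, and any example whose (normalized) distance to the decision surface is smaller than Y is discarded before teaching. The dimension N, the layer half-thickness Y, and the sample count p below are illustrative values, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 200   # input dimension (illustrative)
Y = 0.5   # half-thickness of the excluded layer (illustrative)
p = 1000  # number of accepted (taught) examples

# Teacher perceptron: a fixed random weight vector, normalized so that
# the stability T.x of a standard Gaussian input is standard normal.
T = rng.standard_normal(N)
T /= np.linalg.norm(T)

examples, labels = [], []
while len(examples) < p:
    x = rng.standard_normal(N)
    stability = T @ x          # signed distance to the decision surface
    if abs(stability) > Y:     # exclude the layer of thickness 2Y
        examples.append(x)
        labels.append(np.sign(stability))

X = np.array(examples)
y = np.array(labels)
```

Because the stability of a Gaussian input is a standard normal variable, the acceptance probability of a raw example is 2*Phi(-Y); the accepted set contains only "easy" examples, far from the teacher's decision surface, which is the regime the abstract's programmed-learning suggestion refers to.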