D. O'Kane and O. Winther, Learning to classify in large committee machines, Physical Review E: Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, 50(4), 1994, pp. 3201-3209
The ability of a two-layer neural network to learn a specific nonlinearly separable classification task, the proximity problem, is investigated using a statistical mechanics approach. Both the tree and fully connected architectures are investigated in the limit where the number K of hidden units is large, but still much smaller than the number N of inputs. Both have continuous weights. Within the replica-symmetric ansatz, we find that for zero-temperature training the tree architecture exhibits a strong overtraining effect. For nonzero temperature the asymptotic error is lowered, but it is still higher than the corresponding value for the simple perceptron. The fully connected architecture is considered in two regimes. First, for a finite number of examples we find a symmetry among the hidden units, as each performs equally well. The asymptotic generalization error is finite, and minimal for T -> infinity, where it goes to the same value as for the simple perceptron. For a large number of examples we find a continuous transition to a phase with broken hidden-unit symmetry, which has an asymptotic generalization error equal to zero.
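As an illustrative sketch only (not code from the paper): the two architectures compared in the abstract can be written as committee machines whose output is the majority vote of K sign-valued hidden units. In the tree architecture each hidden unit sees a disjoint block of N/K inputs, while in the fully connected architecture each hidden unit sees all N inputs. Function names, shapes, and the random-weight setup below are my own assumptions for illustration.

```python
import numpy as np

def tree_committee(x, W):
    # Tree architecture: K hidden units, each connected to a
    # disjoint block of N/K inputs; W has shape (K, N // K).
    K, block = W.shape
    fields = np.einsum('kj,kj->k', W, x.reshape(K, block))
    # Output is the sign of the majority vote of the hidden units.
    return np.sign(np.sum(np.sign(fields)))

def fully_connected_committee(x, W):
    # Fully connected architecture: each of the K hidden units
    # sees all N inputs; W has shape (K, N).
    fields = W @ x
    return np.sign(np.sum(np.sign(fields)))

rng = np.random.default_rng(0)
N, K = 12, 3  # K odd so the majority vote cannot tie
x = rng.choice([-1.0, 1.0], size=N)
out_tree = tree_committee(x, rng.standard_normal((K, N // K)))
out_full = fully_connected_committee(x, rng.standard_normal((K, N)))
```

With continuous (here Gaussian) hidden weights, as in the abstract, the hidden fields are almost surely nonzero, so with odd K each machine outputs ±1.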