The performance of the error backpropagation (BP) and ID3 learning algorithms was compared on the task of mapping English text to phonemes and stresses. Under the distributed output code developed by Sejnowski and Rosenberg, it is shown that BP consistently outperforms ID3 on this task by several percentage points. Three hypotheses explaining this difference were explored: (a) ID3 is overfitting the training data, (b) BP is able to share hidden units across several output units and hence can learn the output units better, and (c) BP captures statistical information that ID3 does not. We conclude that only hypothesis (c) is correct. By augmenting ID3 with a simple statistical learning procedure, the performance of BP can be closely matched. More complex statistical procedures can improve the performance of both BP and ID3 substantially in this domain.
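To make the idea of "augmenting ID3 with a statistical learning procedure" concrete, here is a minimal sketch, not the paper's exact procedure: a decision tree trained with the entropy criterion (a rough stand-in for ID3, via scikit-learn) is augmented with empirical class frequencies at its leaves, so each output bit gets a graded score rather than a hard 0/1 decision, analogous to the graded activations of BP's sigmoid output units. The 7-letter window encoding and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy stand-in for Sejnowski & Rosenberg-style 7-letter windows:
# each row is a window of 7 letter codes; the target is one bit of
# a distributed phoneme/stress code (hypothetical data, not the
# NETtalk corpus).
X = rng.integers(0, 26, size=(500, 7))
y = (X[:, 3] % 2 == 0).astype(int)  # synthetic target bit

# Entropy-based tree: splits on information gain, like ID3.
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X, y)

# Unaugmented ID3-style output: a hard 0/1 decision per bit.
hard = tree.predict(X[:5])

# Statistical augmentation: the empirical class frequencies at each
# leaf yield a graded score in [0, 1], which can be thresholded or
# decoded the same way BP's continuous outputs are.
soft = tree.predict_proba(X[:5])[:, 1]

print("hard bits :", hard)
print("soft probs:", np.round(soft, 3))
```

With graded scores available for every bit of the distributed code, the nearest legal phoneme/stress codeword can be chosen by comparing score vectors, which is one way a tree learner can recover the kind of statistical information the abstract credits to BP.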