We show that the Vapnik-Chervonenkis dimension of the class of functions that can be computed by arbitrary two-layer or some completely connected three-layer threshold networks with real inputs is at least linear in the number of weights in the network. In Valiant's "probably approximately correct" learning framework, this implies that the number of random training examples necessary for learning in these networks is at least linear in the number of weights.
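As a rough sketch of how the second statement follows from the first (assuming the standard lower bound on PAC sample complexity in terms of VC dimension; the symbols $\mathcal{F}$, $W$, $d$, and the constant $c$ are introduced here only for illustration):

\[
d \;=\; \mathrm{VCdim}(\mathcal{F}) \;\geq\; c\,W
\quad\text{for some constant } c > 0,
\]
% and, under the usual PAC sample-complexity lower bound, any algorithm
% learning to accuracy $\epsilon$ needs
\[
m \;=\; \Omega\!\left(\frac{d}{\epsilon}\right)
  \;=\; \Omega\!\left(\frac{W}{\epsilon}\right)
\]
% random training examples, i.e. a number at least linear in the number
% of weights $W$ in the network.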