We study pruning strategies in simple perceptrons subjected to supervised learning. Our analytical results, obtained through the statistical mechanics approach to learning theory, are independent of the learning algorithm used in the training process. We calculate the post-training distribution $P(J)$ of synaptic weights, which depends only on the overlap $\rho_0$ achieved by the learning algorithm before pruning and the fraction $\kappa$ of relevant weights in the teacher network. From this distribution, we calculate the optimal pruning strategy for deleting small weights. The optimal pruning threshold grows from zero as $\theta_{\mathrm{opt}}(\rho_0, \kappa) \propto [\rho_0 - \rho_c(\kappa)]^{1/2}$ above some critical value $\rho_c(\kappa)$. Thus, the elimination of weak synapses enhances the network performance only after a critical learning period. Possible implications for biological pruning phenomena are discussed.
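As a rough illustration of the threshold rule summarized above, the sketch below applies magnitude-based pruning to a random perceptron weight vector. It is not taken from the paper: the function names, the unit proportionality constant in the square-root growth law, and the overlap values are hypothetical placeholders.

```python
import numpy as np

def prune_small_weights(J, theta):
    # Zero out synaptic weights whose magnitude falls below the threshold theta
    return np.where(np.abs(J) < theta, 0.0, J)

def theta_opt(rho_0, rho_c):
    # Toy square-root growth theta ~ (rho_0 - rho_c)^(1/2) above the critical
    # overlap rho_c; the proportionality constant is set to 1 as an assumption
    return np.sqrt(max(rho_0 - rho_c, 0.0))

rng = np.random.default_rng(0)
J = rng.normal(size=10)                   # random student weights (illustrative only)
theta = theta_opt(rho_0=0.6, rho_c=0.4)   # hypothetical overlap values
print(prune_small_weights(J, theta))
```

With $\rho_0 \le \rho_c$ the threshold is zero and no weights are deleted, mirroring the statement that pruning helps only after a critical learning period.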