D. Saad, EXPLICIT SYMMETRIES AND THE CAPACITY OF MULTILAYER NEURAL NETWORKS, Journal of Physics A: Mathematical and General, 27(8), 1994, pp. 2719-2734
Calculating the capacity and generalization capabilities of feed-forward multilayer neural networks requires replica-symmetry-breaking methods, which make the calculation practically infeasible. Replica symmetry is broken because the configuration space is disconnected, as is clearly the case in the capacity limit, where the configuration space shrinks to isolated points. Moreover, the number of replica-symmetry-breaking steps required to obtain reliable results is not known. This paper presents novel approaches to the capacity calculation of feed-forward neural networks that avoid replica-symmetry-breaking methods. The basic idea behind these approaches is that breaking the explicit symmetries of the network prior to the capacity calculation itself restores order-parameter symmetry, at least to a good approximation, and therefore permits the use of the replica-symmetry ansatz. Two methods are presented for breaking the explicit symmetries and restoring replica symmetry: one restricts relations between the various weight elements, while the other restricts the values of the order parameters. These methods, demonstrated in this work via the capacity calculation of feed-forward neural networks, are applicable to a variety of capacity, learning and generalization calculations for such nets. We examine an approximation for carrying out the multi-dimensional Gaussian integrals that appear during the calculation, as well as exact results for some simple cases. Numerical results obtained for nets with one to six hidden neurons, using the downhill simplex and adaptive simulated-annealing optimization algorithms, are in good agreement with simulation results.
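The abstract does not give the multi-dimensional Gaussian integrals explicitly, but in replica calculations they are typically Gaussian measures of orthant-like regions with correlated components. As a minimal illustrative sketch (not the paper's actual integrals), the probability that two standard Gaussians with correlation q are both positive has the closed form 1/4 + arcsin(q)/(2π), which a simple Monte Carlo estimate can check:

```python
import math
import random

def orthant_prob_mc(q, n=200_000, seed=1):
    """Monte Carlo estimate of P(z1 > 0, z2 > 0) for standard Gaussians
    z1, z2 with correlation q, built from independent normals x, y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = x
        z2 = q * x + math.sqrt(1.0 - q * q) * y  # correlated with z1
        if z1 > 0.0 and z2 > 0.0:
            hits += 1
    return hits / n

q = 0.5
exact = 0.25 + math.asin(q) / (2.0 * math.pi)  # = 1/3 for q = 0.5
est = orthant_prob_mc(q)
```

For larger numbers of hidden neurons the corresponding integrals are higher-dimensional, which is why the paper resorts to an approximation except in simple cases.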
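The abstract names downhill simplex and adaptive simulated annealing as the optimizers used in the numerical work. As a hedged, generic sketch (a basic geometric-cooling loop on a toy one-dimensional function, not the paper's adaptive variant or its actual free-energy surface), simulated annealing can be written as:

```python
import math
import random

def simulated_annealing(f, x0, steps=20_000, t0=1.0, sigma=0.5, seed=0):
    """Minimize f by simulated annealing with a geometric cooling schedule.

    Downhill moves are always accepted; uphill moves are accepted with
    Boltzmann probability exp(-delta/t), which shrinks as t cools."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for k in range(steps):
        t = t0 * 0.999 ** k                # cooling schedule
        xn = x + rng.gauss(0.0, sigma)     # random local proposal
        fn = f(xn)
        if fn < fx or rng.random() < math.exp(-(fn - fx) / max(t, 1e-12)):
            x, fx = xn, fn
            if fx < fbest:
                best, fbest = x, fx        # track the best point seen
    return best, fbest

# toy objective with its minimum (value 1.0) at x = 2.0
x, fx = simulated_annealing(lambda v: (v - 2.0) ** 2 + 1.0, x0=10.0)
```

In the paper's setting the objective would instead be the saddle-point expression over the order parameters, which is multi-dimensional and may have competing extrema; the stochastic uphill moves are what let the method escape local traps that a pure downhill-simplex search could get stuck in.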