EXPLICIT SYMMETRIES AND THE CAPACITY OF MULTILAYER NEURAL NETWORKS

Authors
D. Saad
Citation
D. Saad, Explicit symmetries and the capacity of multilayer neural networks, Journal of Physics A: Mathematical and General, 27(8), 1994, pp. 2719-2734
Citations number
18
Subject Categories
Physics
ISSN journal
0305-4470
Volume
27
Issue
8
Year of publication
1994
Pages
2719 - 2734
Database
ISI
SICI code
0305-4470(1994)27:8<2719:ESATCO>2.0.ZU;2-8
Abstract
Calculating the capacity and generalization capabilities of feed-forward multilayer neural networks requires the use of replica-symmetry-breaking methods, which makes the calculation practically infeasible. Replica symmetry is broken because the configuration space is disconnected; this is clearly the case in the capacity limit, where the configuration space shrinks to isolated points. Moreover, it is not known how many replica-symmetry-breaking steps are required to obtain reliable results. This paper presents novel approaches that tackle the capacity calculation of feed-forward neural networks while avoiding replica-symmetry-breaking methods. The basic idea behind these approaches is that breaking the explicit symmetries of the network prior to the capacity calculation itself restores order-parameter symmetry, at least to a good approximation, and therefore enables the use of the replica-symmetry ansatz. Two methods for breaking the explicit symmetries and restoring replica symmetry are presented: one restricts relations between the various weight elements, while the other restricts the values of the order parameters. These methods, demonstrated here via the capacity calculation of feed-forward neural networks, are applicable to a variety of capacity, learning and generalization calculations for such nets. We examine an approximation for carrying out the multi-dimensional Gaussian integrals that appear during the calculation, as well as exact results for some simple cases. Numerical results obtained for nets with one to six hidden neurons, using the downhill simplex and adaptive simulated-annealing optimization algorithms, are in good agreement with simulation results.
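
To make the numerical procedure concrete, the following Python sketch (not taken from the paper) illustrates the two ingredients the abstract names: a Monte Carlo approximation of a multi-dimensional Gaussian average over the hidden-unit fields, and extremization of a replica-symmetric expression with the downhill simplex (Nelder-Mead) algorithm. The surrogate free energy, the load parameter ALPHA and the single order parameter q are hypothetical placeholders; the paper's actual saddle-point equations are not reproduced here.

# A minimal, self-contained sketch, assuming a toy one-parameter
# replica-symmetric surrogate rather than the paper's equations.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
K = 3                                   # number of hidden units
Z = rng.standard_normal((20000, K))     # fixed Gaussian sample, one column per hidden field
ALPHA = 2.0                             # hypothetical storage load

def gaussian_average(f, q):
    """Monte Carlo estimate of E_z[f(sqrt(q) * z)] for z ~ N(0, I_K)."""
    return f(np.sqrt(q) * Z).mean()

def surrogate_free_energy(params):
    """Hypothetical RS saddle-point expression in one order parameter q in (0, 1)."""
    q = 1.0 / (1.0 + np.exp(-params[0]))          # map the real line onto (0, 1)
    # Gaussian average over the K hidden-unit fields (Monte Carlo estimate).
    energy = gaussian_average(lambda h: np.log(np.cosh(h)).sum(axis=1), q) / K
    entropy = 0.5 * np.log(1.0 - q)               # weight-space volume vanishes as q -> 1
    return -(ALPHA * energy + entropy)            # extremize by minimizing the negative

# Downhill simplex search over the (transformed) order parameter.
res = minimize(surrogate_free_energy, x0=[0.0], method="Nelder-Mead")
q_star = 1.0 / (1.0 + np.exp(-res.x[0]))
print(f"extremal order parameter q* ~= {q_star:.3f}")

Reusing one fixed Gaussian sample Z across evaluations keeps the objective deterministic, so the simplex search converges cleanly; for rugged, multi-parameter order-parameter surfaces, the adaptive simulated-annealing stage mentioned in the abstract would precede or replace the Nelder-Mead call.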