VAPNIK-CHERVONENKIS DIMENSION OF RECURRENT NEURAL NETWORKS

Citation
P. Koiran and E.D. Sontag, Vapnik-Chervonenkis dimension of recurrent neural networks, Discrete Applied Mathematics, 86(1), 1998, pp. 63-79
Citations number
17
Subject Categories
Mathematics
Volume
86
Issue
1
Year of publication
1998
Pages
63 - 79
Database
ISI
SICI code
Abstract
Most of the work on the Vapnik-Chervonenkis dimension of neural networks has been focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length. Therefore we are particularly interested in the case k >> w. Ignoring multiplicative constants, the main results say roughly the following: For architectures with activation sigma = any fixed nonlinear polynomial, the VC dimension is approximately wk. For architectures with activation sigma = any fixed piecewise polynomial, the VC dimension is between wk and w^2 k. For architectures with activation sigma = H (threshold nets), the VC dimension is between w log(k/w) and min{wk log(wk), w^2 + w log(wk)}. For the standard sigmoid sigma(x) = 1/(1 + e^(-x)), the VC dimension is between wk and w^4 k^2. An earlier version of this paper has appeared in Proc. 3rd European Workshop on Computational Learning Theory, Lecture Notes in Computer Science vol. 1208, Springer, Berlin, 1997, pp. 223-237. (C) 1998 Elsevier Science B.V. All rights reserved.
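For quick reference, the bounds stated in the abstract can be restated, up to multiplicative constants, in the following summary (a paraphrase of the abstract only, not additional results; w denotes the number of weights and k the input-sequence length):

% Summary of the VC-dimension bounds quoted in the abstract,
% valid up to multiplicative constants.
\begin{align*}
\sigma \text{ a fixed nonlinear polynomial:} \quad & \mathrm{VCdim} \approx wk,\\
\sigma \text{ a fixed piecewise polynomial:} \quad & wk \;\lesssim\; \mathrm{VCdim} \;\lesssim\; w^{2}k,\\
\sigma = H \text{ (threshold nets):} \quad & w\log(k/w) \;\lesssim\; \mathrm{VCdim} \;\lesssim\; \min\{\,wk\log(wk),\; w^{2}+w\log(wk)\,\},\\
\sigma(x) = \tfrac{1}{1+e^{-x}} \text{ (standard sigmoid):} \quad & wk \;\lesssim\; \mathrm{VCdim} \;\lesssim\; w^{4}k^{2}.
\end{align*}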