MULTILAYER NEURAL NETWORKS FOR REDUCED-RANK APPROXIMATION

Citation
K.I. Diamantaras and S.Y. Kung, MULTILAYER NEURAL NETWORKS FOR REDUCED-RANK APPROXIMATION, IEEE Transactions on Neural Networks, 5(5), 1994, pp. 684-697
Citations number
21
Subject Categories
Computer Application, Chemistry & Engineering; Engineering, Electrical & Electronic; Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Computer Science, Theory & Methods
ISSN journal
1045-9227
Volume
5
Issue
5
Year of publication
1994
Pages
684 - 697
Database
ISI
SICI code
1045-9227(1994)5:5<684:MNNFRA>2.0.ZU;2-2
Abstract
This paper is developed in two parts. First, we formulate the solution to the general reduced-rank linear approximation problem, relaxing the invertibility assumption on the input autocorrelation matrix used by previous authors. Our treatment unifies linear regression, Wiener filtering, full-rank approximation, auto-association networks, SVD, and Principal Component Analysis (PCA) as special cases. Our analysis also shows that two-layer linear neural networks with a reduced number of hidden units, trained with the least-squares error criterion, produce weights that correspond to the Generalized Singular Value Decomposition of the input-teacher cross-correlation matrix and the input data matrix. As a corollary, the linear two-layer back-propagation model with a reduced hidden layer extracts an arbitrary linear combination of the generalized singular vector components. Second, we investigate artificial neural network models for the solution of the related generalized eigenvalue problem. By introducing and utilizing an extended concept of deflation (originally proposed for the standard eigenvalue problem), we show that a sequential version of linear BP can extract the exact generalized eigenvector components. The advantage of this approach is that it is easier to update the model structure by adding one more unit or pruning one or more units when the application requires it. An alternative approach for extracting the exact components is to use a set of lateral connections among the hidden units, trained in such a way as to enforce orthogonality between the upper- and lower-layer weights. We shall call this the Lateral Orthogonalization Network (LON) and show, via theoretical analysis and verified by simulation, that the network extracts the desired components. The advantage of the LON-based model is that it can be applied in a parallel fashion, so that the components are extracted concurrently. Finally, we show the application of our results to the identification problem for systems whose excitation has a non-invertible autocorrelation matrix. Previous identification methods usually rely on the invertibility assumption on the input autocorrelation and therefore cannot be applied to this case.
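
The first part of the abstract admits a short numerical illustration. The sketch below (plain NumPy; the names, dimensions, and the gradient-descent training loop are illustrative assumptions, not the authors' formulation) builds a rank-deficient input so that the input autocorrelation matrix is singular, computes the reduced-rank least-squares solution through the pseudoinverse and an Eckart-Young truncation, and then checks that a two-layer linear network with the same number of hidden units, trained on the squared error, approaches the same fit:

    import numpy as np

    rng = np.random.default_rng(0)

    n, m, N, p = 6, 4, 200, 2            # input dim, teacher dim, samples, target rank

    # Rank-deficient input: only 3 of the 6 input directions carry energy,
    # so the input autocorrelation matrix X @ X.T / N is singular.
    basis = rng.standard_normal((n, 3))
    X = basis @ rng.standard_normal((3, N))
    T = rng.standard_normal((m, n)) @ X + 0.1 * rng.standard_normal((m, N))

    # Closed-form reduced-rank least-squares fit via the pseudoinverse:
    # project the teacher onto the row space of X, truncate to rank p
    # (Eckart-Young), then map back to a rank-p weight matrix W.
    T_proj = T @ np.linalg.pinv(X) @ X
    U, s, Vt = np.linalg.svd(T_proj, full_matrices=False)
    M = (U[:, :p] * s[:p]) @ Vt[:p]
    W = M @ np.linalg.pinv(X)
    e_opt = np.linalg.norm(T - W @ X) ** 2 / N

    # Two-layer linear network with p hidden units, trained by plain batch
    # gradient descent on the squared error (an illustrative training loop,
    # not the paper's algorithm).
    W1 = 0.1 * rng.standard_normal((p, n))
    W2 = 0.1 * rng.standard_normal((m, p))
    lr = 2e-3
    for _ in range(30000):
        E = W2 @ W1 @ X - T              # residual on the whole batch
        g2 = E @ (W1 @ X).T / N          # gradient for the upper-layer weights
        g1 = W2.T @ E @ X.T / N          # gradient for the lower-layer weights
        W2 -= lr * g2
        W1 -= lr * g1
    e_net = np.linalg.norm(T - W2 @ W1 @ X) ** 2 / N

    print("closed-form reduced-rank error:", e_opt)
    print("two-layer linear network error:", e_net)

The two printed errors should agree closely once gradient descent has converged, which is the sense in which the reduced-hidden-layer network recovers the reduced-rank optimum.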
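
The deflation idea mentioned in the second part can likewise be sketched, here for the standard symmetric eigenvalue problem rather than the generalized one treated in the paper (the routine and names are illustrative): each dominant component is extracted by power iteration and then removed from the matrix before the next pass, so the components are obtained one at a time, mirroring the sequential linear-BP scheme described in the abstract.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 5))
    C = A @ A.T                               # symmetric positive semi-definite matrix

    def leading_eigvec(M, iters=2000):
        """Power iteration for the dominant eigenpair of a symmetric matrix M."""
        v = rng.standard_normal(M.shape[0])
        for _ in range(iters):
            v = M @ v
            v /= np.linalg.norm(v)
        return v @ M @ v, v                   # Rayleigh quotient, unit eigenvector

    vals, vecs = [], []
    D = C.copy()
    for _ in range(3):                        # extract the top 3 components sequentially
        lam, v = leading_eigvec(D)
        vals.append(lam)
        vecs.append(v)
        D = D - lam * np.outer(v, v)          # deflation: remove the found component

    print("sequential extraction:", np.round(vals, 4))
    print("np.linalg.eigvalsh   :", np.round(np.linalg.eigvalsh(C)[::-1][:3], 4))

Both printed lines should list the same three leading eigenvalues, confirming that deflation lets a sequential procedure recover the exact components; the paper's contribution is the extension of this concept to the generalized eigenvalue problem arising from the GSVD formulation.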