The eigenstructure of the second-order statistics of a multivariate random population can be inferred from the matrix of pairwise inner products of the samples. Therefore, it can also be efficiently obtained in the implicit, high-dimensional feature spaces defined by kernel functions. We elaborate on this property to obtain general expressions for the immediate derivation of nonlinear counterparts of a number of standard pattern analysis algorithms, including principal component analysis, data compression and denoising, and Fisher's discriminant. The connection between kernel methods and nonparametric density estimation is also illustrated. Using these results, we introduce the kernel version of the Mahalanobis distance, which gives rise to nonparametric models with unexpected and interesting properties, and we also propose a kernel version of the minimum squared error (MSE) linear discriminant function. This learning machine is particularly simple and includes a number of generalized linear models, such as the potential functions method and the radial basis function (RBF) network. Our results shed some light on the relative merit of feature spaces and inductive bias in the remarkable generalization properties of the support vector machine (SVM). Although in most situations the SVM obtains the lowest error rates, exhaustive experiments with synthetic and natural data show that simple kernel machines based on pseudoinversion are competitive in problems with appreciable class overlapping.
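To make the opening claim concrete, the following is a brief sketch of the standard kernel eigendecomposition argument; the symbols $\phi$, $K$, $\alpha$ and the centering assumption are introduced here only for illustration and are not taken from the abstract. If the mapped samples $\phi(x_1),\dots,\phi(x_n)$ are centered in feature space, every eigenvector of their covariance matrix lies in their span, and the expansion coefficients follow from the Gram matrix alone:

\[
C = \frac{1}{n}\sum_{i=1}^{n}\phi(x_i)\,\phi(x_i)^{\top},
\qquad
v = \sum_{j=1}^{n}\alpha_j\,\phi(x_j),
\qquad
C v = \lambda v
\;\Longrightarrow\;
K\alpha = n\lambda\,\alpha,
\quad
K_{ij} = \langle \phi(x_i),\phi(x_j)\rangle = k(x_i,x_j).
\]

Because only inner products of the samples appear, the same eigenstructure can be computed in any kernel-induced feature space without ever forming $\phi$ explicitly.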