The proposal to treat nonlinear principal component analysis as a kernel eigenvalue problem has provided an extremely powerful method of extracting nonlinear features for a number of classification and regression applications. While the use of Mercer kernels makes it tractable to compute principal components in possibly infinite-dimensional feature spaces, the attendant numerical problem of diagonalizing large matrices remains. In this contribution, we propose an expectation-maximization approach for performing kernel principal component analysis and show it to be a computationally efficient method, especially when the number of data points is large.
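The abstract does not spell out the iteration, but the idea can be illustrated by kernelizing the well-known EM recursion for linear PCA (Roweis-style): since every feature-space loading lies in the span of the mapped data, one can write A = Phi @ W and run both EM steps entirely through the n-by-n Gram matrix K, avoiding any direct eigendecomposition of K. The sketch below is a minimal illustration under that assumption; the function names and iteration count are hypothetical, not taken from the paper.

```python
import numpy as np

def em_kernel_pca(K, q, n_iter=200, seed=0):
    """EM-style iteration for kernel PCA (illustrative sketch).

    K : (n, n) centered Gram matrix K[i, j] = <Phi(x_i), Phi(x_j)>.
    q : number of components.
    Returns W (n, q) such that the implicit feature-space loadings
    A = Phi @ W span the leading q-dimensional principal subspace.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    W = rng.standard_normal((n, q))
    for _ in range(n_iter):
        # E-step: latent scores X = (A^T A)^{-1} A^T Phi,
        # computed entirely via K because A = Phi @ W.
        X = np.linalg.solve(W.T @ K @ W, W.T @ K)   # (q, n)
        # M-step: A_new = Phi X^T (X X^T)^{-1}, i.e.
        # W_new = X^T (X X^T)^{-1}.
        W = np.linalg.solve(X @ X.T, X).T           # (n, q)
    return W

def center_gram(K):
    """Center a Gram matrix in feature space."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J
```

Each sweep costs O(n^2 q) rather than the O(n^3) of a full diagonalization, which is the source of the efficiency claim when n is large. With a linear kernel the recovered subspace can be checked directly against the SVD of the (centered) data matrix.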