ASPECTS OF PSEUDORANK ESTIMATION METHODS BASED ON THE EIGENVALUES OF PRINCIPAL COMPONENT ANALYSIS OF RANDOM MATRICES

Authors

FABER NM BUYDENS LMC KATEMAN G

Citation

N.M. Faber et al., ASPECTS OF PSEUDORANK ESTIMATION METHODS BASED ON THE EIGENVALUES OF PRINCIPAL COMPONENT ANALYSIS OF RANDOM MATRICES, Chemometrics and intelligent laboratory systems, 25(2), 1994, pp. 203-226

Citations number

Subject categories

Computer Application, Chemistry & Engineering; Instrument & Instrumentation; Chemistry Analytical; Computer Science Artificial Intelligence; Robotics & Automatic Control

Journal title

Chemometrics and intelligent laboratory systems

ISSN journal

0169-7439

Volume

25

Issue

2

Year of publication

1994

Pages

203 - 226

Database

ISI

SICI code

0169-7439(1994)25:2<203:AOPEMB>2.0.ZU;2-B

Abstract

Nowadays, analytical instruments that produce a data matrix for one chemical sample enjoy a widespread popularity. However, for a successful analysis of these data an accurate estimate of the pseudorank of the matrix is often a crucial prerequisite. A large number of methods for estimating the pseudorank are based on the eigenvalues obtained from principal component analysis (PCA). In this paper methods are discussed that exploit the essential similarity between the residuals of PCA of the test data matrix and the elements of a random matrix. In the literature of PCA these methods are commonly denoted as parallel analysis. Attention is paid to several aspects that have to be considered when applying such methods. For some of these aspects asymptotic results can be found in the statistical literature. In this study Monte Carlo simulations are used to investigate the practical implications of these theoretical results. It is shown that for sufficiently large matrices the distribution of the measurement error does not significantly influence the results. Down to a very small signal-to-noise ratio the ratio of the number of rows and the number of columns constitutes the major influence on the expected value of the eigenvalues associated with the residuals. The consequences are illustrated for two functions of the eigenvalues, i.e. the logarithm of the eigenvalues and Malinowski's reduced eigenvalues. Both methods are graphical and have been applied in the past with considerable success for a variety of data. Malinowski's reduced eigenvalues are of special interest since they have been used to construct an F-test. Finally, a modification is proposed for pseudorank estimation methods that are based on the principle of parallel analysis.
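
As a rough illustration of the ideas named in the abstract, the following Python sketch computes PCA eigenvalues, Malinowski's reduced eigenvalues, and a simple Monte Carlo pseudorank estimate in the spirit of parallel analysis. It is not the authors' procedure and not the modification proposed in the paper. Several choices are assumptions made here: the function names are hypothetical, the eigenvalues are taken from X^T X without mean-centering, the noise standard deviation is treated as known, and each eigenvalue is compared with a quantile of the largest eigenvalue of simulated Gaussian noise matrices of the same shape, which is a deliberately conservative rule.

```python
import numpy as np


def pca_eigenvalues(X):
    """Eigenvalues of X^T X (squared singular values), largest first.

    Assumption: no mean-centering is applied before the decomposition.
    """
    return np.linalg.svd(np.asarray(X, dtype=float), compute_uv=False) ** 2


def reduced_eigenvalues(eigvals, n_rows, n_cols):
    """Malinowski's reduced eigenvalues: lambda_j / ((r - j + 1) * (c - j + 1))."""
    eigvals = np.asarray(eigvals, dtype=float)
    j = np.arange(1, eigvals.size + 1)
    return eigvals / ((n_rows - j + 1) * (n_cols - j + 1))


def parallel_analysis_rank(X, noise_sd, n_sim=200, quantile=0.95, seed=0):
    """Count leading eigenvalues of X that exceed a random-matrix threshold."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Largest eigenvalue of each simulated pure-noise matrix of the same shape.
    largest = [
        pca_eigenvalues(rng.normal(0.0, noise_sd, size=X.shape))[0]
        for _ in range(n_sim)
    ]
    threshold = np.quantile(largest, quantile)
    rank = 0
    for value in pca_eigenvalues(X):
        if value <= threshold:
            break
        rank += 1
    return rank


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical test matrix: two underlying components plus Gaussian noise.
    signal = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 20)) * 5.0
    X = signal + rng.normal(0.0, 1.0, size=(50, 20))
    eig = pca_eigenvalues(X)
    print("estimated pseudorank:", parallel_analysis_rank(X, noise_sd=1.0))
    print("log eigenvalues:", np.round(np.log(eig[:4]), 2))
    print("reduced eigenvalues:", np.round(reduced_eigenvalues(eig, *X.shape)[:4], 3))
```

Plotting the logarithm of the eigenvalues or the reduced eigenvalues against the component index gives the kind of graphical display the abstract refers to. In such a plot the noise-related values are expected to level off, while the signal-related values stand clearly above them.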