APPLICATIONS OF MAXIMUM-LIKELIHOOD PRINCIPAL COMPONENT ANALYSIS - INCOMPLETE DATA SETS AND CALIBRATION TRANSFER

Citation
D.T. Andrews and P.D. Wentzell, APPLICATIONS OF MAXIMUM-LIKELIHOOD PRINCIPAL COMPONENT ANALYSIS - INCOMPLETE DATA SETS AND CALIBRATION TRANSFER, Analytica chimica acta, 350(3), 1997, pp. 341-352
Number of citations
21
Subject categories
Chemistry Analytical
Journal title
Analytica chimica acta
Journal ISSN
0003-2670
Volume
350
Issue
3
Year of publication
1997
Pages
341 - 352
Database
ISI
SICI code
0003-2670(1997)350:3<341:AOMPCA>2.0.ZU;2-T
Abstract
The application of a new method to the multivariate analysis of incomplete data sets is described. The new method, called maximum likelihood principal component analysis (MLPCA), is analogous to conventional principal component analysis (PCA), but incorporates measurement error variance information in the decomposition of multivariate data. Missing measurements can be handled in a reliable and simple manner by assigning large measurement uncertainties to them. The problem of missing data is pervasive in chemistry, and MLPCA is applied to three sets of experimental data to illustrate its utility. For exploratory data analysis, a data set from the analysis of archeological artifacts is used to show that the principal components extracted by MLPCA retain much of the original information even when a significant number of measurements are missing. Maximum likelihood projections of censored data can often preserve original clusters among the samples and can, through the propagation of error, indicate which samples are likely to be projected erroneously. To demonstrate its utility in modeling applications, MLPCA is also applied in the development of a model for chromatographic retention based on a data set which is only 80% complete. MLPCA can predict missing values and assign error estimates to these points. Finally, the problem of calibration transfer between instruments can be regarded as a missing data problem in which entire spectra are missing on the 'slave' instrument. Using NIR spectra obtained from two instruments, it is shown that spectra on the slave instrument can be predicted from a small subset of calibration transfer samples even if a different wavelength range is employed. Concentration prediction errors obtained by this approach were comparable to cross-validation errors obtained for the slave instrument when all spectra were available.
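
The idea of handling missing measurements by assigning them large uncertainties can be illustrated with a small sketch. The code below is not the authors' MLPCA algorithm; it is a simplified alternating weighted least-squares fit that assumes independent errors with known standard deviations. The function name `weighted_low_rank`, the synthetic data, and the choice of 1e6 as the "large" uncertainty are all assumptions made for illustration.

```python
import numpy as np

def weighted_low_rank(X, sd, rank, n_iter=200):
    """Fit X ~ U @ V.T by minimizing sum(((X - U @ V.T) / sd) ** 2).

    Hypothetical helper: alternating weighted least squares, assuming
    independent measurement errors with known standard deviations `sd`.
    """
    W = 1.0 / sd ** 2                      # inverse-variance weights
    m, n = X.shape
    # Start from the loadings of the ordinary (unweighted) truncated SVD.
    V = np.linalg.svd(X, full_matrices=False)[2][:rank].T.copy()
    U = np.zeros((m, rank))
    for _ in range(n_iter):
        # Update each row of scores by weighted regression on the loadings.
        for i in range(m):
            Vw = V * W[i][:, None]
            U[i] = np.linalg.solve(V.T @ Vw, Vw.T @ X[i])
        # Update each row of loadings by weighted regression on the scores.
        for j in range(n):
            Uw = U * W[:, j][:, None]
            V[j] = np.linalg.solve(U.T @ Uw, Uw.T @ X[:, j])
    return U, V

# Synthetic rank-2 data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
T = np.column_stack([np.linspace(1.0, 2.0, 20), np.sin(np.arange(20))])
P = np.column_stack([np.linspace(1.0, 2.0, 8), np.cos(np.arange(8))])
X_true = T @ P.T
noise_sd = 0.01
X = X_true + noise_sd * rng.standard_normal(X_true.shape)

# Remove one measurement per sample (12.5% of the data).
missing = np.zeros(X.shape, dtype=bool)
missing[np.arange(20), np.arange(20) % 8] = True
X[missing] = 0.0                           # arbitrary placeholder value

# Known uncertainty for observed entries, very large for missing ones.
sd = np.full(X.shape, noise_sd)
sd[missing] = 1e6                          # assumed "large" uncertainty

U, V = weighted_low_rank(X, sd, rank=2)
X_hat = U @ V.T
max_err = np.max(np.abs(X_hat - X_true)[missing])
print("largest error on a predicted missing value:", max_err)
```

Because the missing entries receive essentially zero weight, the placeholder values have almost no influence on the fitted subspace, and the product `U @ V.T` supplies predictions for them. The sketch omits the error estimates for predicted values that the abstract attributes to MLPCA, and it assumes every sample and variable retains at least `rank` observed measurements.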