DISCRIMINANT-ANALYSIS WITH SINGULAR COVARIANCE MATRICES - A METHOD INCORPORATING CROSS-VALIDATION AND EFFICIENT RANDOMIZED PERMUTATION TESTS

Citation
P. Jonathan et al., DISCRIMINANT-ANALYSIS WITH SINGULAR COVARIANCE MATRICES - A METHOD INCORPORATING CROSS-VALIDATION AND EFFICIENT RANDOMIZED PERMUTATION TESTS, Journal of chemometrics, 10(3), 1996, pp. 189-213
Citations number
25
Subject Categories
Chemistry, Analytical; Statistics & Probability
Journal title
Journal of chemometrics
ISSN journal
08869383
Volume
10
Issue
3
Year of publication
1996
Pages
189 - 213
Database
ISI
SICI code
0886-9383(1996)10:3<189:DWSCM->2.0.ZU;2-4
Abstract
A computationally efficient approach has been developed to perform two-group linear discriminant analysis using high-dimensional data. The analysis is based on Fisher's method and incorporates two important validation stages: (1) full leave-one-observation-out cross-validation; (2) randomized permutation distribution testing. The resulting algorithm and software are known as CREDIT (cross-validated random-permutation-tested efficient discrimination based on an adjusted generalized inverse for the sample total covariance matrix). The algorithm has been implemented in the SAS/IML matrix programming language and provides dramatic improvements in computational efficiency compared with existing software for discriminant analysis incorporating validation stages 1 and 2 above. Application of CREDIT to nine multivariate data sets indicates that the predictive performance of the approach, assessed using cross-validation, is comparable with that of other methods for discriminant analysis. Comparisons with two specific methods are included. Randomized permutation tests show that success rates using the true response classes are almost always better than success rates using random permutations of the classes. This gives confidence that there is a useful linear discriminant relationship present in the data being analysed. For a randomly selected training set (used to construct the discriminant rule) the success rates for CREDIT are unbiased predictive success rates for allocating other observations to groups. Predicting group memberships for future observations using any discriminant model based on singular estimates of covariance matrices must be performed with great care. A discussion of methods to test the concordance of future observations with the training set is given.
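
Illustrative sketch (not part of the original record). The abstract's two validation stages can be illustrated with a small Python example. This is not the CREDIT algorithm or its SAS/IML implementation: it assumes a plain Moore-Penrose pseudo-inverse (via NumPy) in place of the paper's adjusted generalized inverse, refits the rule from scratch for every left-out observation rather than using any efficient update, and all function names and the midpoint cut-off rule are hypothetical choices made for illustration.

```python
import numpy as np

def fit_fisher(X, y):
    """Two-group Fisher direction from a pseudo-inverse of the total covariance."""
    m0 = X[y == 0].mean(axis=0)
    m1 = X[y == 1].mean(axis=0)
    Xc = X - X.mean(axis=0)
    T = Xc.T @ Xc / (len(X) - 1)  # sample total covariance, possibly singular
    # Assumption: Moore-Penrose inverse stands in for the adjusted generalized inverse.
    w = np.linalg.pinv(T, rcond=1e-10) @ (m1 - m0)
    cut = w @ (m0 + m1) / 2.0     # assumed rule: midpoint between projected group means
    return w, cut

def predict(X, w, cut):
    """Allocate each row to group 1 if its projection exceeds the cut-off."""
    return (X @ w > cut).astype(int)

def loo_success_rate(X, y):
    """Stage 1: full leave-one-observation-out cross-validation (naive refit)."""
    n = len(y)
    hits = 0
    for i in range(n):
        keep = np.arange(n) != i
        w, cut = fit_fisher(X[keep], y[keep])
        hits += int(predict(X[i:i + 1], w, cut)[0] == y[i])
    return hits / n

def permutation_test(X, y, n_perm=200, seed=0):
    """Stage 2: compare the true-label success rate with rates for permuted labels."""
    rng = np.random.default_rng(seed)
    observed = loo_success_rate(X, y)
    perm_rates = np.array(
        [loo_success_rate(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    # Proportion of permutations doing at least as well as the true labels.
    p_value = (1 + np.sum(perm_rates >= observed)) / (n_perm + 1)
    return observed, p_value
```

Calling `permutation_test(X, y)` with an observations-by-variables array `X` and a 0/1 label vector `y` returns the cross-validated success rate and a permutation p-value. A small p-value corresponds to the abstract's statement that true classes outperform random permutations of the classes.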