High-Breakdown Linear Discriminant Analysis

Citation
D.M. Hawkins and G.J. McLachlan, High-Breakdown Linear Discriminant Analysis, Journal of the American Statistical Association, 92(437), 1997, pp. 136-143
Number of citations
16
Subject Categories
Statistics & Probability
Volume
92
Issue
437
Year of publication
1997
Pages
136 - 143
Database
ISI
SICI code
Abstract
The classification rules of linear discriminant analysis are defined by the true mean vectors and the common covariance matrix of the populations from which the data come. Because these true parameters are generally unknown, they are commonly estimated by the sample mean vector and covariance matrix of the data in a training sample randomly drawn from each population. However, these sample statistics are notoriously susceptible to contamination by outliers, a problem compounded by the fact that the outliers may be invisible to conventional diagnostics. High-breakdown estimation is a procedure designed to remove this cause for concern by producing estimates that are immune to serious distortion by a minority of outliers, regardless of their severity. In this article we motivate and develop a high-breakdown criterion for linear discriminant analysis and give an algorithm for its implementation. The procedure is intended to supplement rather than replace the usual sample-moment methodology of discriminant analysis, either by providing indications that the dataset is not seriously affected by outliers (supporting the usual analysis) or by identifying apparently aberrant points and giving resistant estimators that are not affected by them.
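
Illustration (not part of the original record): a minimal sketch of the usual sample-moment rule the abstract describes, showing how a single gross outlier can distort the estimated class mean. It does not implement the authors' high-breakdown criterion or algorithm, which the abstract does not specify. The function names `fit_lda` and `predict_lda`, the simulated two-class data, the equal-priors assumption, and the outlier value are all hypothetical choices made for this example.

```python
import numpy as np

def fit_lda(X, y):
    """Plug-in LDA: per-class sample means and pooled sample covariance."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Residuals from each point's own class mean (common-covariance assumption).
    resid = X - means[np.searchsorted(classes, y)]
    cov = resid.T @ resid / (len(X) - len(classes))
    return classes, means, cov

def predict_lda(model, X):
    """Assign each row to the class with the smallest Mahalanobis distance.

    Equal prior probabilities are assumed (an assumption of this sketch).
    """
    classes, means, cov = model
    prec = np.linalg.inv(cov)
    diffs = X[:, None, :] - means[None, :, :]          # shape (n, k, p)
    d2 = np.einsum("nkp,pq,nkq->nk", diffs, prec, diffs)
    return classes[np.argmin(d2, axis=1)]

# Simulated training sample: two bivariate normal populations (hypothetical data).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], size=(50, 2))
X1 = rng.normal(loc=[3.0, 3.0], size=(50, 2))
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 50)

clean = fit_lda(X, y)

# Replace one class-0 observation with a gross outlier and refit.
X_bad = X.copy()
X_bad[0] = [1000.0, 1000.0]
dirty = fit_lda(X_bad, y)

# Distance the estimated class-0 mean moved because of that single point.
shift = np.linalg.norm(dirty[1][0] - clean[1][0])
print("class-0 mean shift caused by one outlier:", shift)
```

Comparing `clean` and `dirty` shows the sensitivity the abstract refers to: one contaminated row out of 100 is enough to move the class-0 mean far from the bulk of its data. A resistant fit would replace the sample means and pooled covariance in `fit_lda` with high-breakdown estimates.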