Detection of single influential points in OLS regression model building

Citation
M. Meloun and J. Militky, Detection of single influential points in OLS regression model building, ANALYT CHIM, 439(2), 2001, pp. 169-191
Number of citations
91
Subject categories
Spectroscopy/Instrumentation/Analytical Sciences
Journal title
ANALYTICA CHIMICA ACTA
Journal ISSN
0003-2670
Volume
439
Issue
2
Year of publication
2001
Pages
169 - 191
Database
ISI
SICI code
0003-2670(20010725)439:2<169:DOSIPI>2.0.ZU;2-A
Abstract
Identifying outliers and high-leverage points is a fundamental step in the least-squares regression model building process. Various influence measures based on different motivational arguments, and designed to measure the influence of observations on different aspects of various regression results, are elucidated and critiqued here. On the basis of a statistical analysis of the residuals (classical, normalized, standardized, jackknife, predicted and recursive) and diagonal elements of a projection matrix, diagnostic plots for influential points indication are formed. Regression diagnostics do not require a knowledge of an alternative hypothesis for testing, or the fulfillment of the other assumptions of classical statistical tests. In the interactive, PC-assisted diagnosis of data, models and estimation methods, the examination of data quality involves the detection of influential points, outliers and high-leverages, which cause many problems in regression analysis. This paper provides a basic survey of the influence statistics of single cases combining exploratory analysis of all variables. The graphical aids to the identification of outliers and high-leverage points are combined with graphs for the identification of influence type based on the likelihood distance. All these graphically oriented techniques are suitable for the rapid estimation of influential points, but are generally incapable of solving problems with masking and swamping. The powerful procedure for the computation of influential points characteristics has been written in Matlab 5.3 and is available from authors. (C) 2001 Elsevier Science B.V. All rights reserved.
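
The residual types and projection-matrix diagonal named in the abstract can be illustrated with a minimal sketch. This is not the authors' Matlab procedure: it is a Python/NumPy illustration that assumes the common textbook definitions of standardized, jackknife and predicted residuals, which may differ in detail from those used in the paper. The function name `influence_diagnostics`, the synthetic data and the 2p/n leverage cutoff (a widely used rule of thumb) are all assumptions made for this example.

```python
import numpy as np

def influence_diagnostics(X, y):
    """Single-case OLS diagnostics (hypothetical helper, textbook definitions).

    X : (n, p) design matrix, including the intercept column.
    y : (n,) response vector.
    """
    n, p = X.shape
    # Diagonal of the projection (hat) matrix H = X (X'X)^-1 X',
    # obtained from a thin QR factorization without forming H itself.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                      # classical residuals
    s2 = e @ e / (n - p)                  # residual variance estimate

    standardized = e / np.sqrt(s2 * (1.0 - h))
    # Leave-one-out variance estimate, computed without refitting.
    s2_loo = ((n - p) * s2 - e**2 / (1.0 - h)) / (n - p - 1)
    jackknife = e / np.sqrt(s2_loo * (1.0 - h))
    predicted = e / (1.0 - h)             # leave-one-out prediction errors

    return {
        "leverage": h,
        "classical": e,
        "standardized": standardized,
        "jackknife": jackknife,
        "predicted": predicted,
    }

# Synthetic data (assumed for illustration): a straight line with one
# point far out in x (high leverage) and one shifted in y (outlier).
rng = np.random.default_rng(0)
x = np.append(np.arange(10.0), 30.0)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)
y[4] += 8.0
X = np.column_stack([np.ones_like(x), x])

diag = influence_diagnostics(X, y)
n, p = X.shape
# Rule-of-thumb cutoff 2p/n for leverage; an assumption, not from the paper.
print("high-leverage candidates:", np.where(diag["leverage"] > 2 * p / n)[0])
print("largest |jackknife residual| at index:",
      int(np.argmax(np.abs(diag["jackknife"]))))
```

Separating the leverage values from the residual-based quantities mirrors the abstract's distinction between high-leverage points and outliers. Like the graphical techniques described there, these single-case statistics do not address masking or swamping.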