B. Johnson et al., THE USE OF REGRESSION EQUATIONS FOR QUALITY-CONTROL IN A PESTICIDE PHYSICAL PROPERTY DATABASE, Environmental management, 19(1), 1995, pp. 127-134
Quality control is a crucial aspect of database management, particularly
for physicochemical parameters that are widely used in modeling
environmental fate processes. Complete rechecking of original studies to
verify environmental fate parameters is time-consuming and difficult.
This paper evaluates an alternative, more efficient approach to
identifying database errors. The approach focuses verification efforts on a
targeted subset of entries by making use of the relationship between
water solubility (S) and the soil organic carbon partition coefficient
(Koc). Two regression equations, one selected from the literature and one
calculated from entries in the database, were used to evaluate the
reasonableness of (S, Koc) pairs in a control group and in a targeted
outlier group drawn from a total of 59 pesticides. Our hypothesis was that
(S, Koc) pairs that lay far from the regression line were more likely
to be in error than those that fit the regression. Database values
were checked against original studies. Identified errors in the database
included coding mistakes, miscalculations, and incorrect chemical
identification codes. The error rate in outlier (S, Koc) pairs was about
twice that of pairs that conformed to the regression equation; however,
the error rate differential was probably not large enough to justify
the use of this quality control method. Through our close scrutiny
of database entries we were able to identify administrative practices
that led to mistakes in the database. Resolution of these problems
will significantly decrease the number of future mistakes.
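
The screening idea described in the abstract can be illustrated with a minimal sketch. It fits a regression to (S, Koc) pairs and flags the entries that lie far from the line as candidates for checking against original studies. The abstract does not state the form of the regression, the outlier criterion, or any data values, so everything specific here is an assumption made for illustration: the log-log linear form, the cutoff of two residual standard deviations, the function names fit_line and flag_outliers, and the hypothetical entries pesticide_A through pesticide_H.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def flag_outliers(pairs, threshold=2.0):
    """Return names of entries whose log10(Koc) lies far from the fitted line.

    pairs maps a name to an (S, Koc) tuple. The regression is fitted to the
    entries themselves on log10 axes (an assumed form). An entry is flagged
    when its absolute residual exceeds `threshold` residual standard
    deviations (an assumed criterion, not taken from the paper).
    """
    names = list(pairs)
    xs = [math.log10(pairs[name][0]) for name in names]
    ys = [math.log10(pairs[name][1]) for name in names]
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum(r * r for r in residuals) / (len(residuals) - 2))
    return [name for name, r in zip(names, residuals) if abs(r) > threshold * s]


# Hypothetical entries (not real pesticide data): seven follow a common
# log-log trend, and pesticide_H has a Koc that is 100 times too large,
# as a coding mistake might produce.
example_pairs = {
    "pesticide_A": (0.01, 100000.0),
    "pesticide_B": (0.1, 31623.0),
    "pesticide_C": (1.0, 10000.0),
    "pesticide_D": (10.0, 3162.0),
    "pesticide_E": (100.0, 1000.0),
    "pesticide_F": (1000.0, 316.0),
    "pesticide_G": (10000.0, 100.0),
    "pesticide_H": (10.0, 316200.0),
}

print(flag_outliers(example_pairs))
```

Flagged entries are only candidates for verification. As the abstract notes, outlier pairs were in error about twice as often as conforming pairs, so a flag raises the chance of finding a mistake but does not establish one.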