We challenge the notion that double data entry is either sufficient or
necessary to ensure good-quality data in clinical trials. Although we
do not completely reject that notion, we quantify some of the effects
that poor quality data have on final study results in terms of
estimation, significance testing, and power. By introducing digit
errors into simulated blood pressure measurements we demonstrate that
simple range checks allow us to detect (and therefore correct) the
main errors that impact the final study results and conclusions. The
errors that cannot easily be detected by such range checks, although
possibly numerous, are shown to be of little importance in drawing the
correct conclusions from the statistical analysis of data. Exploratory data analysis
cannot identify all errors that a second data entry would detect, but
on the other hand, not all errors that are found by exploratory data
analysis are detectable by double data entry. Double data entry is
concerned solely with ensuring, to a high degree of certainty, that what
is recorded on the case record form is transcribed into the database.
Exploratory data analysis looks beyond the case record form to
challenge the plausibility of the written data. In this sense, the
second entering of data has some benefit, but the use of exploratory
data analysis methods, either as data entry is ongoing or at the end
of data entry and as the first stage in an analysis strategy, should
always be mandatory. (C) Elsevier Science Inc. 1998.
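
The simulation idea described above can be illustrated with a minimal
sketch. It is not the authors' code: the normal distribution (mean 90,
SD 10 mmHg), the 5% error rate, the plausible range of 40 to 140 mmHg,
and all function names are assumptions chosen only for illustration.
A digit error replaces one digit of a recorded value with a different
digit, and a range check flags values outside the plausible range.

```python
import random
import statistics


def simulate_bp(n, mean=90.0, sd=10.0, rng=None):
    """Draw n integer readings (mmHg) from an assumed normal distribution."""
    if rng is None:
        rng = random.Random()
    return [max(0, round(rng.gauss(mean, sd))) for _ in range(n)]


def corrupt_digit(value, rng):
    """Replace one randomly chosen digit of value with a different digit."""
    digits = list(str(value))
    pos = rng.randrange(len(digits))
    digits[pos] = rng.choice([d for d in "0123456789" if d != digits[pos]])
    return int("".join(digits))


def add_errors(values, error_rate, rng):
    """Corrupt each value independently with probability error_rate."""
    return [corrupt_digit(v, rng) if rng.random() < error_rate else v
            for v in values]


def range_check(values, low=40, high=140):
    """Return the indices of values outside the assumed range [low, high]."""
    return [i for i, v in enumerate(values) if not low <= v <= high]


def run_demo(n=1000, error_rate=0.05, seed=1998):
    """Compare the mean before and after removing range-check failures."""
    rng = random.Random(seed)
    true_values = simulate_bp(n, rng=rng)
    entered = add_errors(true_values, error_rate, rng)
    flagged = set(range_check(entered))
    errors = {i for i, (t, e) in enumerate(zip(true_values, entered)) if t != e}
    cleaned = [v for i, v in enumerate(entered) if i not in flagged]
    return {
        "errors": len(errors),
        "detected": len(errors & flagged),
        "undetected": len(errors - flagged),
        "mean_true": statistics.mean(true_values),
        "mean_entered": statistics.mean(entered),
        "mean_cleaned": statistics.mean(cleaned),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value}")
```

Under these assumptions, an error such as 85 entered as 805 or as 5
falls outside the range and is flagged, whereas a last-digit error
such as 85 entered as 86 stays inside the range and goes undetected.
Comparing the three means shows how much each kind of error shifts
the estimate in a given run.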