D. Greenhalgh et al., A partial ranking method for identifying repeated inclusion of individuals in anonymized HIV infection reports, BIOMETRICS, 55(1), 1999, pp. 165-173
Diagnoses of HIV infection are reported to the Public Health Laboratory
Service (PHLS) AIDS Centre under a voluntary surveillance scheme. Names are
not held in the data set, but the date of birth of the individual concerned
is usually available. This paper describes a statistical method for
identifying whether there are likely to be individuals repeatedly
represented in the resulting data set, which is considered by birth year. A
partial ordering method is used that is especially useful for years where
the number of birth years in the sample is too small for chi-squared tests
to be used. At the 5% level, one of the five birth years tested in the data
supplied to us by the PHLS shows evidence of more replication than would be
expected from independent random sampling from the population. The results
are compared with an alternative maximum-likelihood-based test that reaches
the same conclusions. Maximum likelihood methods are further used to
estimate the percentage of overcounting of individuals in the sample at
2.7%.
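
The idea of checking one birth year for more repeated birth dates than independent sampling would produce can be illustrated with a minimal Monte Carlo sketch. This is not the partial ordering method or the maximum likelihood test of the paper; it is a simpler stand-in that assumes births are spread uniformly over the days of the year. The function names, the uniform assumption, and the parameters days_in_year, n_sim, and seed are all hypothetical choices made for illustration.

```python
import random


def count_extra_records(days):
    """Number of records beyond the first for each distinct birth date."""
    return len(days) - len(set(days))


def replication_p_value(days, days_in_year=365, n_sim=2000, seed=0):
    """One-sided Monte Carlo p-value for excess repeated birth dates.

    Assumption (not from the paper): under the null hypothesis each report
    comes from a different individual whose birth day is uniform over the
    year. A real analysis would use the population birth distribution.
    """
    rng = random.Random(seed)
    observed = count_extra_records(days)
    n = len(days)
    hits = 0
    for _ in range(n_sim):
        # Draw n independent birth days and compute the same statistic.
        simulated = [rng.randrange(days_in_year) for _ in range(n)]
        if count_extra_records(simulated) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Made-up day-of-year values for a single birth year, for illustration.
    sample = [12, 12, 45, 45, 45, 100, 201, 201, 310, 333]
    print(replication_p_value(sample))
```

A small p-value for a birth year would suggest more shared birth dates than chance alone explains, which is the kind of evidence of replication the abstract refers to at the 5% level.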