Postoperative cognitive dysfunction (POCD) has been the subject of extensive
research. In the literature, large differences are apparent in methodology such
as the test batteries, the interval between sessions, the endpoints to be
analysed, statistical methods, and how neuropsychological deficits are
defined. Traditionally, intelligence tests or tests developed for clinical
neuropsychology have been used. The tests for detecting POCD should be based on
well-described sensitivity and suitability in relation to surgical patients.
In tests using scores, floor/ceiling effects may compromise the evaluation
if the tests are either too easy or too difficult. Uncontrolled testing
facilities and changes of test personnel may affect test performance.
Practice effects are pronounced in neuropsychological tests but have generally
been ignored. The use of a suitable normative population is essential to
allow correction for practice effects and variability between sessions.
Loss to follow-up may severely compromise the validity of conclusions, since
subjects unable or unwilling to be examined are particularly prone to POCD.
In the statistical analysis of the test results, the evaluation should be
based on differences between pre- and postoperative performance. Parametric
statistical tests are not appropriate unless the data follow a Gaussian
distribution, perhaps after transformation. The definition of cognitive
dysfunction should be restrictive, and the criteria should be fulfilled by
only a small proportion of volunteers. In the literature, these requirements
have often not been fulfilled. This precludes a reasonable estimation of the
incidence of POCD, and the conclusions of comparative studies should be
interpreted with great caution. In this review article, we present
a number of recommendations for the design and execution of studies within
this area. In addition, the critical reader may use these recommendations
in the evaluation of the literature.
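
As an illustration of how a normative population can be used to correct pre- to postoperative differences for practice effects, the following minimal sketch standardises each patient's change score against the changes seen in controls. The Z-score formula, the cut-off of -1.96, the function names, and the example numbers are all assumptions made for illustration and are not taken from the text above; the sketch also assumes that higher scores mean better performance and that the control changes are approximately Gaussian.

```python
from statistics import mean, stdev


def change_z_scores(patient_changes, control_changes):
    """Standardise post-minus-pre changes against a control group.

    Assumes higher test scores mean better performance, so a negative
    Z-score indicates a decline relative to the controls.
    """
    practice_effect = mean(control_changes)  # average change in controls
    spread = stdev(control_changes)          # between-session variability
    if spread == 0:
        raise ValueError("control changes show no variability")
    return [(change - practice_effect) / spread for change in patient_changes]


def flag_decline(z_scores, threshold=-1.96):
    """Flag patients whose corrected change is at or below the threshold.

    The threshold is a hypothetical restrictive cut-off, chosen so that
    few controls would meet it if their changes were Gaussian.
    """
    return [z <= threshold for z in z_scores]


# Illustrative, made-up change scores (postoperative minus preoperative).
control_changes = [2, 3, 1, 4, 2, 3]
patient_changes = [3, -1, 2.5]

z_scores = change_z_scores(patient_changes, control_changes)
flags = flag_decline(z_scores)
```

In this sketch a patient whose score improves by less than the controls' average improvement receives a negative Z-score, so an unchanged raw score can still count as a relative decline once practice effects are taken into account.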