We have examined a number of statistical issues associated with
methods for evaluating different tests of density dependence. The lack
of definitive standards and benchmarks for conducting simulation
studies makes it difficult to assess the performance of various tests.
The biological researcher has a bewildering choice of statistical
tests for testing density dependence and the list is growing. The most
recent additions have been based on computationally intensive methods
such as permutation tests and bootstrapping. We believe the
computational effort and time involved will preclude their widespread
adoption until: (1) these methods have been fully explored under a
wide range of conditions
and shown to be demonstrably superior to other, simpler methods, and
(2) general-purpose software is made available for performing the
calculations.

We have advocated the use of Bulmer's (first) test as a de
facto standard for comparative studies on the grounds of its
simplicity, applicability, and satisfactory performance under a
variety of conditions. We show that, in terms of power, Bulmer's test
is robust to certain departures from normality although, as noted by
other authors, it is affected by temporal trends in the data. We are
not convinced that the reported differences in power between Bulmer's
test and the randomisation test of Pollard et al. (1987) justify the
adoption of the
latter. Nor do we believe a compelling case has been established for
the parametric bootstrap likelihood ratio test of Dennis and Taper
(1994).

Bulmer's test is essentially a test of the serial correlation in the
(log) abundance data and is affected by the presence of autocorrelated
errors. In such cases the test cannot distinguish between the
autoregressive effect in the errors and a true density-dependent
effect in
the time series data. We suspect other tests may be similarly
affected, although this is an area for further research. We have also
noted that in the presence of autocorrelation, the type I error rates
can be
substantially different from the assumed level of significance,
implying that in such cases the test is based on a faulty significance
region. We have indicated both qualitatively and quantitatively how
autoregressive error terms can affect the power of Bulmer's test,
although we
suggest that more work is required in this area. These apparent
inadequacies of Bulmer's test should not be interpreted as a failure
of the
statistical procedure since the test was not intended to be used with
autocorrelated error terms.
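
The effect of autocorrelated errors on the type I error rate can be
illustrated with a small simulation. The sketch below is not the
procedure used in this study. It assumes the commonly cited form of
Bulmer's first statistic, R = V/U, where V is the sum of squared
deviations of the log abundances from their mean and U is the sum of
squared successive differences, with small values of R taken as
evidence of density dependence. The critical value is estimated by
Monte Carlo under a density-independent random walk rather than taken
from a published formula. The function names, the AR(1) model for the
increments, the series length of 30, and the values of phi are all
illustrative assumptions.

```python
import random


def bulmer_r(x):
    """Assumed form of Bulmer's first statistic, R = V / U, for log abundances x."""
    n = len(x)
    mean = sum(x) / n
    v = sum((xi - mean) ** 2 for xi in x)                   # spread about the mean
    u = sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1))   # successive differences
    return v / u


def simulate_walk(n, phi, rng, burn_in=50):
    """Density-independent random walk whose increments follow an AR(1) process.

    phi = 0 gives independent normal increments (the usual null model).
    """
    e = 0.0
    for _ in range(burn_in):          # let the AR(1) error settle down
        e = phi * e + rng.gauss(0.0, 1.0)
    x = [0.0]
    for _ in range(n - 1):
        e = phi * e + rng.gauss(0.0, 1.0)
        x.append(x[-1] + e)
    return x


def null_critical_value(n, alpha, reps, rng):
    """Monte Carlo estimate of the lower alpha quantile of R under phi = 0."""
    stats = sorted(bulmer_r(simulate_walk(n, 0.0, rng)) for _ in range(reps))
    return stats[int(alpha * reps)]


def rejection_rate(n, phi, crit, reps, rng):
    """Proportion of density-independent series for which R falls below crit."""
    hits = sum(bulmer_r(simulate_walk(n, phi, rng)) < crit for _ in range(reps))
    return hits / reps


if __name__ == "__main__":
    rng = random.Random(1)
    crit = null_critical_value(30, 0.05, 2000, rng)
    for phi in (0.0, -0.5, 0.5):
        rate = rejection_rate(30, phi, crit, 2000, rng)
        print(f"phi = {phi:+.1f}  estimated type I error rate = {rate:.3f}")
```

Every simulated series is density independent, so each rejection is a
type I error. Under these assumptions, negatively autocorrelated
increments should push the rejection rate well above the nominal
level, and positively autocorrelated increments should push it below.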