1. For biological community data (species-by-sample abundance matrices),
Warwick & Clarke (1995) defined two biodiversity indices, capturing the
structure not only of the distribution of abundances amongst species but
also the taxonomic relatedness of the species in each sample. The first
index, taxonomic diversity (Delta), can be thought of as the average
taxonomic 'distance' between any two organisms, chosen at random from the
sample: this distance can be visualized simply as the length of the path
connecting these two organisms, traced through (say) a Linnean or
phylogenetic classification of the full set of species involved. The second
index, taxonomic distinctness (Delta*), is the average path length between
any two randomly chosen individuals, conditional on them being from
different species. This is equivalent to dividing taxonomic diversity,
Delta, by the value it would take were there to be no taxonomic hierarchy
(all species belonging to the same genus). Delta* can therefore be seen as a
measure of pure taxonomic relatedness, whereas Delta mixes taxonomic
relatedness with the evenness properties of the abundance distribution.
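
As an illustration of the two definitions in point 1, the following Python sketch computes Delta and Delta* for a single sample. It assumes that the two individuals are drawn without replacement, so that Delta is the sum of omega_ij x_i x_j over species pairs divided by n(n-1)/2. The names `x`, `omega`, `flat` and the function names are hypothetical, and the path lengths in the example are invented purely for illustration.

```python
from itertools import combinations

def weighted_pair_sums(x, omega):
    """Return (sum of omega_ij*x_i*x_j, sum of x_i*x_j) over species pairs i < j."""
    weighted = 0.0
    unweighted = 0.0
    for i, j in combinations(range(len(x)), 2):
        weighted += omega[i][j] * x[i] * x[j]
        unweighted += x[i] * x[j]
    return weighted, unweighted

def taxonomic_diversity(x, omega):
    """Delta: mean path length over all pairs of individuals (assumed definition)."""
    n = sum(x)
    weighted, _ = weighted_pair_sums(x, omega)
    return weighted / (n * (n - 1) / 2)

def taxonomic_distinctness(x, omega):
    """Delta*: mean path length over pairs of individuals from different species."""
    weighted, unweighted = weighted_pair_sums(x, omega)
    return weighted / unweighted

# Invented example: species 0 and 1 share a genus (path length 1);
# species 2 is in another genus (path length 2 to either of them).
x = [2, 1, 1]
omega = [[0, 1, 2],
         [1, 0, 2],
         [2, 2, 0]]

# "No taxonomic hierarchy": every pair of distinct species has path length 1.
flat = [[0 if i == j else 1 for j in range(3)] for i in range(3)]

delta = taxonomic_diversity(x, omega)
delta_star = taxonomic_distinctness(x, omega)
delta_flat = taxonomic_diversity(x, flat)
print(delta, delta_star, delta / delta_flat)
```

In this example the last two printed values agree, which is the equivalence stated in point 1: Delta* equals Delta divided by the value Delta takes when all species belong to the same genus.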
2. This paper explores the statistical sampling properties of Delta and
Delta*. Taxonomic diversity is seen to be a natural extension of a form of
Simpson's index, incorporating taxonomic (or phylogenetic) information.
Importantly for practical comparisons, both Delta and Delta* are shown not
to be dependent, on average, on the degree of sampling effort involved in
the data collection; this is in sharp contrast with those diversity measures
that are strongly influenced by the number of observed species.
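
The link to Simpson's index in point 2 can be made explicit with a short derivation. This is a sketch under assumed notation that does not appear in the abstract: x_i is the abundance of species i, n is the total number of individuals, omega_ij is the path length between species i and j, and Delta^o is a hypothetical symbol for the value of Delta when every omega_ij (i ≠ j) equals 1.

```latex
\Delta = \frac{\sum\sum_{i<j} \omega_{ij}\, x_i x_j}{n(n-1)/2},
\qquad
\Delta^{\circ} = \frac{\sum\sum_{i<j} x_i x_j}{n(n-1)/2}
             = \frac{n^2 - \sum_i x_i^2}{n(n-1)}
             = 1 - \frac{\sum_i x_i (x_i - 1)}{n(n-1)},
\qquad
\Delta^{*} = \frac{\Delta}{\Delta^{\circ}} .
```

The middle expression is one common form of Simpson's index (the probability that two individuals drawn without replacement belong to different species), so Delta reduces to it when the taxonomic weights carry no information.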
3. The special case where the data consist only of presence/absence
information is dealt with in detail: Delta and Delta* converge to the same
statistic (Delta(+)), which is now defined as the average taxonomic path
length between any two randomly chosen species. Its lack of dependence, in
mean value, on sampling effort implies that Delta(+) can be compared across
studies with differing and uncontrolled degrees of sampling effort (subject
to assumptions concerning comparable taxonomic accuracy). This may be of
particular significance for historic (diffusely collected) species lists
from different localities or regions, which at first sight may seem
unamenable to valid diversity comparison of any sort.
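
For presence/absence data, the definition of Delta(+) in point 3 amounts to averaging path lengths over all pairs of species in a list. The Python sketch below illustrates this with a hypothetical two-level classification (genus, family) and assumed path lengths of 1, 2 and 3 steps; the species labels, the `TAXONOMY` table and the function names are all invented for the example.

```python
from itertools import combinations

# Hypothetical (genus, family) labels for a four-species list.
TAXONOMY = {
    "sp_a": ("G1", "F1"),
    "sp_b": ("G1", "F1"),
    "sp_c": ("G2", "F1"),
    "sp_d": ("G3", "F2"),
}

def path_length(first, second):
    """Assumed weights: 1 within a genus, 2 within a family, 3 otherwise."""
    genus_1, family_1 = TAXONOMY[first]
    genus_2, family_2 = TAXONOMY[second]
    if genus_1 == genus_2:
        return 1
    if family_1 == family_2:
        return 2
    return 3

def average_taxonomic_distinctness(species):
    """Delta(+): mean path length over all pairs of distinct species present."""
    pairs = list(combinations(sorted(set(species)), 2))
    if not pairs:
        raise ValueError("need at least two species")
    return sum(path_length(a, b) for a, b in pairs) / len(pairs)

full_list = average_taxonomic_distinctness(["sp_a", "sp_b", "sp_c", "sp_d"])
one_family = average_taxonomic_distinctness(["sp_a", "sp_b", "sp_c"])
print(full_list, one_family)
```

The list confined to a single family gives the lower value, which is the sense in which Delta(+) measures taxonomic spread rather than the number of species.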
4. Furthermore, a randomization test is possible, to detect a difference in
the taxonomic distinctness, for any observed set of species, from the
'expected' Delta(+) value derived from a master species list for the
relevant group of organisms. The exact randomization procedure requires
heavy computation, and an approximation is developed, by deriving an
appropriate variance formula. This leads to a 'confidence funnel' against
which distinctness values for any specific area, pollution condition,
habitat type, etc., can be checked, and formally addresses the question of
whether a putatively impacted locality has a 'lower than expected' taxonomic
spread. The procedure is illustrated for the UK species list of free-living
marine nematodes and sets of samples from intertidal sites in two
localities, the Exe estuary and the Firth of Clyde.
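
The randomization idea in point 4 can be sketched by brute-force simulation: draw many random subsets of m species from a master list, compute Delta(+) for each, and compare the observed value with that distribution. The Python sketch below does this for an invented eight-species master list; it is not the variance-formula approximation developed in the paper, and `MASTER`, the path-length weights, the number of draws and all function names are assumptions made for illustration.

```python
import random
from itertools import combinations

# Invented master list: (genus, family) for eight species.
MASTER = [("G1", "F1"), ("G1", "F1"), ("G2", "F1"),
          ("G3", "F2"), ("G3", "F2"), ("G4", "F2"),
          ("G5", "F3"), ("G6", "F3")]

def path_length(a, b):
    """Assumed weights: 1 within a genus, 2 within a family, 3 otherwise."""
    if a[0] == b[0]:
        return 1
    if a[1] == b[1]:
        return 2
    return 3

SIZE = len(MASTER)
OMEGA = [[0 if i == j else path_length(MASTER[i], MASTER[j])
          for j in range(SIZE)] for i in range(SIZE)]

def delta_plus(species, omega):
    """Mean path length over all pairs of the given species indices."""
    pairs = list(combinations(species, 2))
    return sum(omega[i][j] for i, j in pairs) / len(pairs)

def simulate_delta_plus(omega, m, draws, rng):
    """Delta(+) for `draws` random subsets of m species from the master list."""
    return [delta_plus(rng.sample(range(len(omega)), m), omega)
            for _ in range(draws)]

def lower_tail_p(observed, simulated):
    """Share of simulated values at or below the observed one."""
    return sum(v <= observed for v in simulated) / len(simulated)

def funnel_limits(simulated, coverage=0.95):
    """Empirical central interval of the simulated values."""
    ordered = sorted(simulated)
    tail = (1 - coverage) / 2
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int((1 - tail) * (len(ordered) - 1))]
    return lo, hi

rng = random.Random(0)          # fixed seed so the sketch is repeatable
observed = [0, 1, 2]            # a sample confined to one family
observed_value = delta_plus(observed, OMEGA)
master_value = delta_plus(range(SIZE), OMEGA)

simulated = simulate_delta_plus(OMEGA, len(observed), 2000, rng)
p_lower = lower_tail_p(observed_value, simulated)
mean_simulated = sum(simulated) / len(simulated)

# One pair of limits per sample size m traces out the funnel.
funnel = {m: funnel_limits(simulate_delta_plus(OMEGA, m, 2000, rng))
          for m in (3, 5, 7)}
print(observed_value, master_value, mean_simulated, p_lower, funnel)
```

The simulated mean should sit close to the master-list value whatever the subset size, in line with the lack of dependence on sampling effort noted in point 3, while the spread of the limits is expected to narrow as m grows, which gives the funnel its shape.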