STATISTICAL POWER OF EDF TESTS OF NORMALITY AND THE SAMPLE-SIZE REQUIRED TO DISTINGUISH GEOMETRIC-NORMAL (LOGNORMAL) FROM ARITHMETIC-NORMAL DISTRIBUTIONS OF LOW VARIABILITY
P.D. Gingerich, STATISTICAL POWER OF EDF TESTS OF NORMALITY AND THE SAMPLE-SIZE REQUIRED TO DISTINGUISH GEOMETRIC-NORMAL (LOGNORMAL) FROM ARITHMETIC-NORMAL DISTRIBUTIONS OF LOW VARIABILITY, Journal of Theoretical Biology, 173(2), 1995, pp. 125-136
Many biological variables are distributed geometrically
(proportionally) rather than arithmetically, and they are lognormal
rather than normal on the usual arithmetic scale of measurement. The
distinction is important because it affects statistical interpretation
at many levels: for example, logarithmic transformations commonly used
in biology to standardize variances and linearize relationships of
arithmetic measurements will skew underlying distributions if these are
inherently arithmetic-normal but not if they are geometric-normal. The
purpose of this
study is to determine theoretically, using Monte Carlo simulation,
which of a range of recommended tests of normality has greatest power,
and what sample size is required to distinguish geometric-normal from
arithmetic-normal distributions when inherent variability is low, as it
is in most biological distributions. Lilliefors' version of the
Kolmogorov-Smirnov test, Frosini's test, and the Anderson-Darling test
are three non-parametric goodness-of-fit tests of normality based on an
observed empirical distribution function. When inherent variability V is
on the order of 0.10 (standard deviation 10% of mean), Lilliefors' test
requires a minimum sample of about 2200 to correctly distinguish
lognormal distributions from normal 95% of the time (with the level of
significance or type I error rate alpha and the type II error rate beta
both 0.05). In the same situation, Frosini's test requires a minimum
sample of about 1700; the Anderson-Darling test is more powerful, but
still requires a minimum sample of about 1500. Power is sensitive to
inherent variability: when V = 0.05 the Anderson-Darling test requires
a minimum sample much greater than 2500, but when V = 0.15 it requires
a minimum sample of only about 650. Sensitivity of the power of all
tests to inherent variability means that the normality of body
measurements like weight with V typically similar to 0.15 is more
easily tested than the normality of body lengths with V typically
similar to 0.05 in the same sample. Inherent variability must be
considered in designing empirical tests of normality, and differences
in inherent variability must be considered in interpreting results.
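
The kind of Monte Carlo power estimate described above can be sketched roughly as follows. This is an illustrative reconstruction, not the procedure used in the paper, which the abstract does not spell out. The function names (lognormal_sigma, anderson_darling_normal, estimate_power) are hypothetical. The small-sample correction factor and the 5% critical value of 0.752 are assumptions taken from Stephens' standard tables for the Anderson-Darling test with mean and variance estimated from the sample. Only the Python standard library is used.

```python
import math
import random

# Assumed 5% critical value for the modified A^2 statistic when the mean
# and variance are estimated from the data (Stephens' tables).
AD_CRITICAL_5PCT = 0.752


def lognormal_sigma(v):
    """Log-scale standard deviation of a lognormal whose coefficient of
    variation on the arithmetic scale is v: sigma = sqrt(ln(1 + v^2))."""
    return math.sqrt(math.log(1.0 + v * v))


def anderson_darling_normal(sample):
    """Modified Anderson-Darling A^2 statistic for a test of normality,
    with mean and standard deviation estimated from the sample."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    z = sorted((x - mean) / sd for x in sample)
    # Standard normal CDF, clipped away from 0 and 1 so the logs stay finite.
    eps = 1e-12
    cdf = [
        min(max(0.5 * (1.0 + math.erf(value / math.sqrt(2.0))), eps), 1.0 - eps)
        for value in z
    ]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample modification for the estimated-parameter case (assumed).
    return a2 * (1.0 + 0.75 / n + 2.25 / (n * n))


def estimate_power(v, n, reps=200, seed=1):
    """Fraction of simulated lognormal samples of size n, with coefficient
    of variation v, for which normality is rejected at the 5% level."""
    rng = random.Random(seed)
    sigma = lognormal_sigma(v)
    rejections = 0
    for _ in range(reps):
        sample = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
        if anderson_darling_normal(sample) > AD_CRITICAL_5PCT:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Compare estimated power at a fixed sample size for three values of V.
    for v in (0.05, 0.10, 0.15):
        print(f"V = {v:.2f}, n = 650: power ~ {estimate_power(v, 650):.2f}")
```

The conversion in lognormal_sigma shows why low variability is the hard case: as V shrinks, the log-scale standard deviation shrinks with it and the lognormal becomes nearly symmetric, so far larger samples are needed before the test can detect the departure from normality. Estimates from a small number of replicates are noisy; many more replicates than the default here would be needed to pin down a minimum sample size.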