Normal and lognormal data distribution in geochemistry: death of a myth. Consequences for the statistical treatment of geochemical and environmental data

Citation
C. Reimann and P. Filzmoser, Normal and lognormal data distribution in geochemistry: death of a myth. Consequences for the statistical treatment of geochemical and environmental data, ENVIR GEOL, 39(9), 2000, pp. 1001-1014
Number of citations
38
Subject Categories
Environment/Ecology
Journal title
ENVIRONMENTAL GEOLOGY
ISSN journal
0943-0105
Volume
39
Issue
9
Year of publication
2000
Pages
1001 - 1014
Database
ISI
SICI code
0943-0105(200007)39:9<1001:NALDDI>2.0.ZU;2-Q
Abstract
All variables of several large data sets from regional geochemical and environmental surveys were tested for a normal or lognormal data distribution. As a general rule, almost all variables (up to more than 50 analysed chemical elements per data set) show neither a normal nor a lognormal data distribution. Even when different transformation methods are used, more than 70% of all variables in every single data set do not approach a normal distribution. Distributions are usually skewed, have outliers and originate from more than one process. When dealing with regional geochemical or environmental data, normal and/or lognormal distributions are the exception and not the rule. This observation has serious consequences for the further statistical treatment of geochemical and environmental data. The most widely used statistical methods are all based on the assumption that the studied data show a normal or lognormal distribution. Neglecting that geochemical and environmental data show neither a normal nor a lognormal distribution will lead to biased or faulty results when such techniques are used.
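
The kind of check the abstract describes can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the abstract does not name the tests used, so the Shapiro-Wilk test (`scipy.stats.shapiro`), the 0.05 significance level, the function name `classify_distribution` and the synthetic two-population sample are all assumptions made for this example.

```python
import numpy as np
from scipy import stats


def classify_distribution(values, alpha=0.05):
    """Label a sample 'normal', 'lognormal' or 'neither'.

    Assumption: Shapiro-Wilk at level `alpha` stands in for whatever
    tests the paper applied; the abstract does not specify them.
    """
    x = np.asarray(values, dtype=float)
    x = x[np.isfinite(x)]  # drop missing values

    # H0: the raw values come from a normal distribution.
    _, p_raw = stats.shapiro(x)
    if p_raw > alpha:
        return "normal"

    # A variable is lognormal if its logarithm is normal;
    # this is only defined for strictly positive data.
    if np.all(x > 0):
        _, p_log = stats.shapiro(np.log(x))
        if p_log > alpha:
            return "lognormal"

    return "neither"


# Hypothetical data: a mixture of two lognormal populations, mimicking
# a variable that originates from more than one process.
rng = np.random.default_rng(0)
mixture = np.concatenate([
    rng.lognormal(mean=0.0, sigma=0.3, size=400),
    rng.lognormal(mean=2.5, sigma=0.3, size=100),
])

print(classify_distribution(mixture))
```

Because the two populations are well separated, the sample stays bimodal even after the log transform, so neither test can accept it. This mirrors the abstract's point that data originating from more than one process will usually fail both checks.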