Retrospective selection bias (or the benefit of hindsight)

Authors
Citation
F. Mulargia, Retrospective selection bias (or the benefit of hindsight), GEOPHYS J I, 146(2), 2001, pp. 489-496
Number of citations
36
Subject Categories
Earth Sciences
Journal title
GEOPHYSICAL JOURNAL INTERNATIONAL
ISSN journal
0956-540X
Volume
146
Issue
2
Year of publication
2001
Pages
489 - 496
Database
ISI
SICI code
0956-540X(200108)146:2<489:RSB(TB>2.0.ZU;2-0
Abstract
The complexity of geophysical systems makes modelling them a formidable task, and in many cases research studies are still in the phenomenological stage. In earthquake physics, long timescales and the lack of any natural laboratory restrict research to retrospective analysis of data. Such 'fishing expedition' approaches lead to optimal selection of data, albeit not always consciously. This introduces significant biases, which are capable of falsely representing simple statistical fluctuations as significant anomalies requiring fundamental explanations. This paper identifies three different strategies for discriminating real issues from artefacts generated retrospectively. The first attempts to identify ab initio each optimal choice and account for it. Unfortunately, a satisfactory solution can only be achieved in particular cases. The second strategy acknowledges this difficulty as well as the unavoidable existence of bias, and classifies all 'anomalous' observations as artefacts unless their retrospective probability of occurrence is exceedingly low (for instance, beyond six standard deviations). However, such a strategy is also likely to reject some scientifically important anomalies. The third strategy relies on two separate steps with learning and validation performed on effectively independent sets of data. This approach appears to be preferable in the case of small samples, such as are frequently encountered in geophysics, but the requirement for forward validation implies long waiting times before credible conclusions can be reached. A practical application to pattern recognition, which is the prototype of retrospective 'fishing expeditions', is presented, illustrating that valid conclusions are hard to find.
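
The selection effect described in the abstract can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the paper's method: the Gaussian noise model, the 100 candidate windows, the 5% significance level, and the function names `one_sided_p`, `corrected_p` and `simulate` are all hypothetical choices made for illustration. It picks the most extreme of many pure-noise observations retrospectively, then checks the same choice against an independent draw, roughly in the spirit of the second and third strategies.

```python
import math
import random


def one_sided_p(z):
    """Upper-tail probability of a standard normal variate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def corrected_p(z, n_windows):
    """Probability that at least one of n_windows independent noise draws exceeds z."""
    return 1.0 - (1.0 - one_sided_p(z)) ** n_windows


def simulate(n_trials=2000, n_windows=100, seed=0):
    """Return (naive, validated) rates of 'significant' anomalies in pure noise.

    Assumed setup: every window holds an independent standard-normal value,
    so any apparent anomaly is a statistical fluctuation by construction.
    """
    rng = random.Random(seed)
    naive_hits = 0      # retrospectively chosen window passes the 5% test
    validated_hits = 0  # the same window also passes on independent data
    for _ in range(n_trials):
        learn = [rng.gauss(0.0, 1.0) for _ in range(n_windows)]
        # Retrospective "optimal" selection: keep only the most extreme window.
        best_value = max(learn)
        if one_sided_p(best_value) < 0.05:
            naive_hits += 1
            # Forward validation: a fresh, independent draw for the chosen window.
            fresh = rng.gauss(0.0, 1.0)
            if one_sided_p(fresh) < 0.05:
                validated_hits += 1
    return naive_hits / n_trials, validated_hits / n_trials


if __name__ == "__main__":
    naive, validated = simulate()
    print(f"naive 'anomaly' rate:     {naive:.3f}")
    print(f"validated 'anomaly' rate: {validated:.3f}")
    print(f"corrected p at 3 sigma:   {corrected_p(3.0, 100):.3g}")
    print(f"corrected p at 6 sigma:   {corrected_p(6.0, 100):.3g}")
```

Under these assumptions, nearly every trial yields a retrospectively "significant" window, while the validated rate stays close to the nominal 5%. The `corrected_p` helper suggests why a very strict threshold such as six standard deviations survives a modest number of implicit choices, whereas a three-sigma excess does not.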