Retrospective selection bias (or the benefit of hindsight)

Authors

Mulargia, F

Citation

F. Mulargia, Retrospective selection bias (or the benefit of hindsight), GEOPHYS J I, 146(2), 2001, pp. 489-496

Subject categories

Earth Sciences

Journal title

GEOPHYSICAL JOURNAL INTERNATIONAL

ISSN journal

0956-540X

Volume

146

Issue

2

Year of publication

2001

Pages

489 - 496

Database

ISI

SICI code

0956-540X(200108)146:2<489:RSB(TB>2.0.ZU;2-0

Abstract

The complexity of geophysical systems makes modelling them a formidable task, and in many cases research studies are still in the phenomenological stage. In earthquake physics, long timescales and the lack of any natural laboratory restrict research to retrospective analysis of data. Such 'fishing expedition' approaches lead to optimal selection of data, albeit not always consciously. This introduces significant biases, which are capable of falsely representing simple statistical fluctuations as significant anomalies requiring fundamental explanations. This paper identifies three different strategies for discriminating real issues from artefacts generated retrospectively. The first attempts to identify ab initio each optimal choice and account for it. Unfortunately, a satisfactory solution can only be achieved in particular cases. The second strategy acknowledges this difficulty as well as the unavoidable existence of bias, and classifies all 'anomalous' observations as artefacts unless their retrospective probability of occurrence is exceedingly low (for instance, beyond six standard deviations). However, such a strategy is also likely to reject some scientifically important anomalies. The third strategy relies on two separate steps with learning and validation performed on effectively independent sets of data. This approach appears to be preferable in the case of small samples, such as are frequently encountered in geophysics, but the requirement for forward validation implies long waiting times before credible conclusions can be reached. A practical application to pattern recognition, which is the prototype of retrospective 'fishing expeditions', is presented, illustrating that valid conclusions are hard to find.
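
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's practical application. It is a toy example under stated assumptions: the data are pure standard normal noise, the "optimal selection" is a search for the most extreme fixed-width window, and the function names (best_window_z, window_z, simulate) and all parameter values are hypothetical. It compares the apparent significance of the retrospectively chosen window with the significance of the same window evaluated on an independent data set, in the spirit of the third strategy.

```python
import random
import statistics


def window_z(series, start, width):
    """z-score of the mean of one window, assuming unit-variance noise."""
    mean = sum(series[start:start + width]) / width
    # The standard error of a mean of `width` unit-variance samples is 1/sqrt(width).
    return mean * width ** 0.5


def best_window_z(series, width):
    """Retrospective 'fishing': return (start, z) of the most extreme window."""
    best_start, best_z = 0, 0.0
    for start in range(len(series) - width + 1):
        z = window_z(series, start, width)
        if abs(z) > abs(best_z):
            best_start, best_z = start, z
    return best_start, best_z


def simulate(trials=200, n=500, width=20, seed=0):
    """Average |z| of the selected window on learning and on validation data."""
    rng = random.Random(seed)
    learned, validated = [], []
    for _ in range(trials):
        # Both series are pure noise: there is no real anomaly to find.
        learning = [rng.gauss(0, 1) for _ in range(n)]
        validation = [rng.gauss(0, 1) for _ in range(n)]
        start, z = best_window_z(learning, width)
        learned.append(abs(z))
        # Evaluate the window chosen in the learning step on independent data.
        validated.append(abs(window_z(validation, start, width)))
    return statistics.mean(learned), statistics.mean(validated)


mean_learned, mean_validated = simulate()
print(f"mean |z| of retrospectively chosen window: {mean_learned:.2f}")
print(f"mean |z| of same window on independent data: {mean_validated:.2f}")
```

Under these assumptions, the retrospectively selected window looks far more anomalous than the same window does on independent data, even though no anomaly exists in either series. The gap between the two numbers is a rough measure of the selection bias introduced by the search.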