Computer-intensive randomization in systematics

Authors
Citation
M.E. Siddall, Computer-intensive randomization in systematics, CLADISTICS, 17(1), 2001, pp. S35-S52
Number of citations
85
Subject categories
Biology
Journal title
CLADISTICS-THE INTERNATIONAL JOURNAL OF THE WILLI HENNIG SOCIETY
ISSN journal
0748-3007
Volume
17
Issue
1
Year of publication
2001
Part
2
Pages
S35 - S52
Database
ISI
SICI code
0748-3007(200103)17:1<S35:CRIS>2.0.ZU;2-Y
Abstract
There has been a sort of cottage industry in the development of randomization routines in systematics beginning with the bootstrap and the jackknife and, in a sense, culminating with various Monte Carlo routines that have been used to assess the performance of phylogenetic methods in limiting circumstances. These methods can be segregated into three basic areas of interest: measures of support such as bootstrap, jackknife, Permutation Tail Probability, T-PTP, and MoJo; measures of how well independent data are correlated in a phylogenetic framework like PCP for coevolution and Manhattan Stratigraphic Measure (MSM) for stratigraphy; and simulation-based Monte-Carlo methods for ascertaining relative performance of optimality criteria or coding methods. Although one approach to assessing cospeciation questions has been the randomization of, for example, host and parasite trees, it is well established that in questions that are of a correlative type, the associations themselves are what should be permuted. This has been applied to Brooks' parsimony analysis previously and here to the recent reconciled tree approach to these questions. Although it is debatable whether the extrinsic temporal position of a fossil can stand as refutation of intrinsic morphological character-based cladograms, one can, nonetheless, determine the strength and significance of fit of stratigraphic data to a cladogram. The only method available in this regard that has been shown to not be biased by tree shape is the MSM and modifications of that. Another similar approach that is new is applied to evaluating the historical informativeness of base composition biases. Incongruence length difference tests too are essentially correlative in nature and comparing the behavior of "perceived" partitions to randomly determined partitions of the same size has become the standard for interpreting the relative conflict between differently acquired data.
Unlike the foregoing, which make full use of the observed structure of the data, Monte Carlo methods require the input of parameters or of models and in that sense the results tend to be lacking in verisimilitude. Nonetheless, these kinds of questions seem to have been those most widely promulgated in our field. The well-established theoretical proposition that parsimony has problems with adjacent long-branches was of course illustrated through such methods, much to the concern and angst of systematists. That likelihood later was shown to perform worse than parsimony when those long branches might repel each other has generated less concern and angst. But then many such circumstances can be divined, like the "short-branch-mess" problem wherein likelihood has difficulty placing just a single long branch. Overall, then, in the interpretation of these or any other Monte Carlo issues it will be important to critically examine the structure of the modeled process and the scope of inferences that can be drawn therefrom. Modeling situations that are bound to yield results favorable to only one approach (such as unrealistic even splitting of ancestral populations at unrealistically predictable times in examination of the coding of polymorphic data) should be viewed with great caution. More to the point, since history is singular and not repeatable, the utility of statistical approaches may itself be dubious except in very special circumstances; most of the requirements for stochasticity and independence can never be met. (C) 2001 The Willi Hennig Society.