Simulation of random dendrograms and comparison tests: Some comments

Authors
J. Podani
Citation
J. Podani, Simulation of random dendrograms and comparison tests: Some comments, J CLASSIF, 17(1), 2000, pp. 123-142
Number of citations
25
Subject Categories
Library & Information Science
Journal title
JOURNAL OF CLASSIFICATION
Journal ISSN
0176-4268
Volume
17
Issue
1
Year of publication
2000
Pages
123 - 142
Database
ISI
SICI code
0176-4268(2000)17:1<123:SORDAC>2.0.ZU;2-L
Abstract
It is shown that there is a simple, easily understood alternative to the double permutation algorithm for generating random, fully ranked dendrograms. The paper also examines the utility of five different dendrogram descriptors in statistical analyses of dendrogram similarity. They serve as a logical basis for comparisons under different simulation models: cophenetic difference is valid for weighted dendrograms, partition membership divergence for fully ranked dendrograms, whereas subtree membership divergence and cluster membership divergence are best suited to partially ranked dendrograms. The latter two descriptors possess the ultrametric property for all triples, but are called quasi-ultrametrics because they do not satisfy the identity axiom. The fifth descriptor considered is path difference, which is not recommended for comparisons except for unrooted trees. Correlations among dendrogram descriptors are evaluated through simulation experiments, and it is shown that the significance of dendrogram comparisons is greatly influenced by the choice of the descriptor. The paper emphasizes that choice of the underlying tree distribution to be used as a reference in testing significance of a dendrogram comparison measure should be consistent with the descriptor incorporated by that measure.
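
Illustrative note: the ultrametric property mentioned in the abstract can be made concrete with a small sketch. The Python below is not taken from the paper; the toy dendrogram, the merge-list representation, and the function names `cophenetic` and `is_ultrametric` are assumptions made only for illustration. It derives cophenetic values (the height at which two objects first join) from a weighted dendrogram and checks the inequality d(x, z) <= max(d(x, y), d(y, z)) for all triples.

```python
from itertools import permutations

# Hypothetical toy weighted dendrogram on four objects.
# Each merge is (left cluster, right cluster, fusion height).
merges = [
    (("a",), ("b",), 1.0),
    (("c",), ("d",), 2.0),
    (("a", "b"), ("c", "d"), 3.0),
]

def cophenetic(merges):
    """Map each unordered pair to the height at which the pair first joins."""
    coph = {}
    for left, right, height in merges:
        for x in left:
            for y in right:
                coph[frozenset((x, y))] = height
    return coph

def is_ultrametric(dist, labels):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for every triple of distinct labels."""
    for x, y, z in permutations(labels, 3):
        d_xz = dist[frozenset((x, z))]
        d_xy = dist[frozenset((x, y))]
        d_yz = dist[frozenset((y, z))]
        if d_xz > max(d_xy, d_yz):
            return False
    return True

labels = ["a", "b", "c", "d"]
coph = cophenetic(merges)

# A deliberately non-ultrametric set of dissimilarities for contrast.
not_ultra = {
    frozenset(("a", "b")): 1.0,
    frozenset(("b", "c")): 1.0,
    frozenset(("a", "c")): 3.0,
}
```

Calling `is_ultrametric(coph, labels)` on the toy dendrogram returns `True`, while the contrast matrix fails the check. The sketch covers only pairs of distinct objects, so it does not address the identity axiom that separates the quasi-ultrametrics from true ultrametrics.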