F.-J. Lapointe and J. A. W. Kirsch, Estimating phylogenies from lacunose distance matrices, with special reference to DNA hybridization data, Molecular Biology and Evolution, 12(2), 1995, pp. 266-284
Distance methods for producing phylogenies require n^2 comparisons among
n taxa to generate a complete matrix. Moreover, techniques for generating
distances, such as DNA hybridization, are subject to both systematic and
random experimental errors, so that the measurements do not satisfy the
mathematical properties of distances. We have explored the possibility of
reconstructing trees from incomplete data. In our simulations, we discard
one or both members of reciprocal pairs from a complete matrix, estimate
these values, reconstruct a tree, and compare the topology and branch
lengths of the estimated tree with the phylogeny based on complete data.
We investigated separately and jointly the effects of rate variation and
of random and systematic errors, added to a fabricated ultrametric matrix,
and then proceeded to simulation experiments with several complete DNA
hybridization matrices. Our empirical results show that topological and
metric recovery is always very good, provided that no terminal sister taxa
lack both reciprocal measurements and no extremely short internodes are
involved. We then present two applications of the method for estimating
phylogenies from incomplete DNA hybridization matrices: the first
illustrates reconstruction of a matrix with about 27% of its cells
missing, and the second sutures two matrices where some data are held in
common but 29% are missing from the combined table. Thus, considerable
information may be implicit in very sparse matrices, and this circumstance
has practical consequences for distance studies when money, material, or
time is limited.
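
The estimation step described above can be illustrated with a minimal sketch. It is not necessarily the procedure used in the paper. It assumes missing cells are marked with None, fills a cell whose reciprocal is present from that reciprocal, and estimates a pair lacking both measurements from the ultrametric three-point condition d(i,j) <= max(d(i,k), d(j,k)), taking the smallest such bound over all third taxa k. The function name fill_missing and the 4-taxon matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def fill_missing(matrix):
    """Return a symmetric copy of `matrix` with missing cells (None) estimated.

    Assumption: the underlying distances are roughly ultrametric.
    """
    n = len(matrix)
    d = [[None] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0

    # Step 1: fold reciprocal pairs. If only one of d(i,j), d(j,i) is
    # present, use it for both cells; if both are present, average them.
    for i, j in combinations(range(n), 2):
        known = [v for v in (matrix[i][j], matrix[j][i]) if v is not None]
        if known:
            d[i][j] = d[j][i] = sum(known) / len(known)

    # Step 2: for pairs lacking both measurements, take the tightest
    # ultrametric upper bound over every third taxon k with known distances.
    estimates = {}
    for i, j in combinations(range(n), 2):
        if d[i][j] is not None:
            continue
        bounds = [
            max(d[i][k], d[j][k])
            for k in range(n)
            if k not in (i, j) and d[i][k] is not None and d[j][k] is not None
        ]
        if bounds:  # otherwise the cell stays None
            estimates[(i, j)] = min(bounds)

    # Apply estimates only after the scan, so no estimate feeds another.
    for (i, j), value in estimates.items():
        d[i][j] = d[j][i] = value
    return d


# Hypothetical ultrametric matrix for the tree ((A,B),(C,D)).
complete = [
    [0, 2, 10, 10],
    [2, 0, 10, 10],
    [10, 10, 0, 4],
    [10, 10, 4, 0],
]

# Discard both reciprocal cells for the non-sister pair (A, C).
lacunose = [row[:] for row in complete]
lacunose[0][2] = lacunose[2][0] = None
filled = fill_missing(lacunose)
print(filled[0][2])
```

In this toy case the discarded A-C distance is recovered at its true value of 10. If both cells for the sister pair A-B are discarded instead, the same rule also returns 10 rather than the true value of 2, which is in line with the abstract's caveat about terminal sister taxa that lack both reciprocal measurements.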