Sequence data for a group of species is often summarized by a distance matr
ix M where M[s, t] is the dissimilarity between the sequences of species s
and t. An ordinal assertion is a statement of the form "species a and b are
as similar as species c and d" and is supported by distance matrix M if M[
a, b] less than or equal to M[c, d]. Recent preliminary research suggests t
hat ordinal assertions can be used to reconstruct the evolutionary history
of a group of species effectively. However, further research on the mathema
tical and algorithmic properties of ordinal assertions is needed to facilit
ate the development and assessment of inference methods that utilize ordina
l assertions for reconstructing evolutionary histories.
A (weighted) ordinal representation of a distance matrix M is a (weighted)
phylogeny T such that, for all species a, b, c, and d labeling T,
d(T) (a, b) less than or equal to d(T)(c, d) if and only if M[a, b] less th
an or equal to M[c, d],
where dr is the weighted path length when T is weighted. otherwise dr is th
e unweighted path length. Hence, an ordinal representation of M is a phylog
eny that supports the same ordinal assertions supported by M, and so is the
focus of our examination of the mathematical and algorithmic properties of
ordinal assertions.
As it turns out, ordinal representations are rich in structure. In this pap
er several results on weighted and unweighted ordinal representations are p
resented:
The unweighted ordinal representation of a distance matrix is unique. This
generalizes the well-known result that no two phylogenies share the same di
stance matrix [10], [21].
The unweighted ordinal representation of a distance matrix can be found in
O (n(2) log(2)(n)) time. The algorithm presented improves upon an O (n(3))
algorithm by Kannan and Warnow [13] that finds binary unweighted ordinal re
presentations of distance matrices.
Under certain conditions, weighted ordinal representations can be found in
polynomial time.