P. J. Lockhart et al., Recovering evolutionary trees under a more realistic model of sequence evolution, Molecular Biology and Evolution, 11(4), 1994, pp. 605-612
We report a new transformation, the LogDet, that is consistent for
sequences with differing nucleotide composition and that have arisen
under simple but asymmetric stochastic models of evolution. This
transformation is required because existing methods tend to group
sequences on the basis of their nucleotide composition, irrespective of
their evolutionary history. This effect of differing nucleotide
frequencies is illustrated by using a tree-selection criterion on a
simple distance measure defined solely on the basis of base
composition, independent of the actual sequences. The new LogDet
transformation uses determinants of the observed divergence matrices
and works because multiplication of determinants (real numbers) is
commutative, whereas multiplication of matrices is not, except in
special symmetric cases. The use of determinants thus allows more
general models of evolution with asymmetric rates of nucleotide change.
The transformation is illustrated on a theoretical data set (where
existing methods select the wrong tree) and with three biological data
sets: chloroplasts, birds/mammals (nuclear), and honeybees
(mitochondrial). The LogDet transformation reinforces the logical
distinction between transformations on the data and tree-selection
criteria. The overall conclusions from this study are that irregular
A,C,G,T compositions are an important and possible general cause of
patterns that can mislead tree-reconstruction methods, even when high
bootstrap values are obtained. Consequently, many published studies may
need to be reexamined.
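
The determinant-based idea described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the abstract gives no formula, so the code assumes the basic form d = -ln det(F), where F is the 4x4 divergence matrix of joint base proportions for an aligned pair, plus a commonly used normalization (an assumption here) that rescales by the base frequencies of each sequence so that identical sequences get distance zero. All function names (`divergence_matrix`, `determinant`, `matmul`, `logdet_distance`) are hypothetical.

```python
import math

BASES = "ACGT"


def divergence_matrix(seq_x, seq_y):
    """Return the 4x4 matrix F, where F[i][j] is the proportion of aligned
    sites with base i in seq_x and base j in seq_y."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0.0] * 4 for _ in range(4)]
    n_sites = 0
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        if a in index and b in index:  # skip gaps and ambiguity codes
            counts[index[a]][index[b]] += 1
            n_sites += 1
    if n_sites == 0:
        raise ValueError("no comparable sites")
    return [[c / n_sites for c in row] for row in counts]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def matmul(a, b):
    """Plain matrix product, used only to illustrate non-commutativity."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def logdet_distance(seq_x, seq_y, normalize=True):
    """LogDet distance for an aligned pair.

    normalize=False gives the basic form -ln det(F).
    normalize=True applies an assumed, commonly used rescaling by the
    base frequencies of each sequence and the factor 1/4.
    """
    f = divergence_matrix(seq_x, seq_y)
    det_f = determinant(f)
    if det_f <= 0:
        raise ValueError("LogDet undefined: det(F) is not positive")
    if not normalize:
        return -math.log(det_f)
    freq_x = [sum(row) for row in f]                       # row sums
    freq_y = [sum(f[i][j] for i in range(4)) for j in range(4)]  # column sums
    if min(freq_x) <= 0 or min(freq_y) <= 0:
        raise ValueError("every base must occur in both sequences")
    log_freqs = sum(math.log(p) for p in freq_x + freq_y)
    return -0.25 * (math.log(det_f) - 0.5 * log_freqs)
```

The property the abstract relies on can be checked directly: for two matrices A and B, `matmul(A, B)` and `matmul(B, A)` generally differ, yet `determinant` returns the same value for both products, which is why the logarithm of the determinant adds up along a path in the tree even when substitution rates are asymmetric. The distance is undefined when det(F) is not positive, which can happen for short or highly diverged sequences; the sketch raises an error in that case rather than guessing a value.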