Knowledge of the pattern of nucleotide substitution is important both
to our understanding of molecular sequence evolution and to reliable
estimation of phylogenetic relationships. The method of parsimony
analysis, which has been used to estimate substitution patterns in
real sequences, has serious drawbacks and leads to results that are
difficult to interpret. In this paper a model-based maximum likelihood
approach is proposed for estimating substitution patterns in real
sequences. Nucleotide substitution is assumed to follow a homogeneous
Markov process, and both the general reversible process model (REV)
and the unrestricted model without the reversibility assumption are
used. These models are also applied to examine the adequacy of the
model of Hasegawa et al. (J. Mol. Evol. 1985;22:160-174) (HKY85). Two
data sets are analyzed. For the ψη-globin pseudogenes of six primate
species, the REV model fits the data much better than HKY85, whereas
for a segment of mtDNA sequences from nine primates, REV does not
provide a significantly better fit than HKY85 when rate variation over
sites is taken into account in the models. It is concluded that the
use of the REV model in phylogenetic analysis can be recommended,
especially for large data sets or for sequences with extreme
substitution patterns, while HKY85 may be expected to provide a good
approximation. The use of the unrestricted model does not appear to be
worthwhile.
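
To make the relationship among the three models concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes the common parameterization in which a reversible rate matrix has off-diagonal entries q_ij = s_ij * pi_j, with symmetric exchangeabilities s_ij and equilibrium frequencies pi_j. Under that assumption HKY85 is the special case of REV in which the two transition pairs (T<->C, A<->G) share one exchangeability kappa and all transversions have exchangeability 1, so REV has five free exchangeabilities where HKY85 has one. The function names, the nucleotide ordering, the frequencies, and the value of kappa are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: names, nucleotide order and numbers are assumptions.
NUCS = "TCAG"


def rev_rate_matrix(exch, freqs):
    """Build a reversible rate matrix with q_ij = s_ij * pi_j for i != j.

    exch maps each unordered nucleotide pair (a frozenset) to its
    exchangeability s_ij; freqs lists equilibrium frequencies in NUCS order.
    Diagonal entries are set so that every row sums to zero.
    """
    n = len(NUCS)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                Q[i][j] = exch[frozenset((NUCS[i], NUCS[j]))] * freqs[j]
        Q[i][i] = -sum(Q[i])  # diagonal is still 0.0 when the sum is taken
    return Q


def hky85_exchangeabilities(kappa):
    """HKY85 as a restricted REV: transitions get kappa, transversions 1."""
    transitions = {frozenset("TC"), frozenset("AG")}
    exch = {}
    for i in range(len(NUCS)):
        for j in range(i + 1, len(NUCS)):
            pair = frozenset((NUCS[i], NUCS[j]))
            exch[pair] = kappa if pair in transitions else 1.0
    return exch


def is_reversible(Q, freqs, tol=1e-12):
    """Check detailed balance: pi_i * q_ij == pi_j * q_ji for all i, j."""
    n = len(freqs)
    return all(
        abs(freqs[i] * Q[i][j] - freqs[j] * Q[j][i]) <= tol
        for i in range(n)
        for j in range(n)
    )


freqs = [0.1, 0.2, 0.3, 0.4]  # hypothetical frequencies for T, C, A, G
Q_hky = rev_rate_matrix(hky85_exchangeabilities(4.0), freqs)

# An unrestricted matrix lets each of the 12 off-diagonal rates vary freely.
# Doubling a single rate (T -> C) is enough to break detailed balance.
Q_unres = [row[:] for row in Q_hky]
Q_unres[0][1] *= 2.0
Q_unres[0][0] = -(Q_unres[0][1] + Q_unres[0][2] + Q_unres[0][3])

print(is_reversible(Q_hky, freqs))
print(is_reversible(Q_unres, freqs))
```

The first check should report that the HKY85 matrix satisfies detailed balance, and the second that the perturbed matrix does not. Fitting any of these models to data additionally requires a likelihood calculation over a phylogeny, which this sketch does not attempt.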