Y. Vandepeer et al., RECONSTRUCTING EVOLUTION FROM EUKARYOTIC SMALL-RIBOSOMAL-SUBUNIT RNA SEQUENCES - CALIBRATION OF THE MOLECULAR CLOCK, Journal of molecular evolution, 37(2), 1993, pp. 221-232
The detailed descriptions now available for the secondary structure of
small-ribosomal-subunit RNA, including areas of highly variable primary
structure, facilitate the alignment of nucleotide sequences. However,
for optimal exploitation of the information contained in the alignment,
a method must be available that takes into account the local sequence
variability in the computation of evolutionary distance. A quantitative
definition for the variability of an alignment position is proposed in
this study. It is a parameter in an equation which expresses the
probability that the alignment position contains a different nucleotide
in two sequences, as a function of the distance separating these
sequences, i.e., the number of substitutions per nucleotide that
occurred during their divergence. This parameter can be estimated from
the distance matrix resulting from the conversion of pairwise sequence
dissimilarities into pairwise distances. Alignment positions can then be
subdivided into a number of sets of matching variability, and the
average variability of each set can be derived. Next, the conversion of
dissimilarity into distance can be recalculated for each set of
alignment positions separately, using a modified version of the equation
that corrects for multiple substitutions, in which the parameter
reflecting average variability is changed for each set. The distances
computed for each set are finally averaged, giving a more precise
distance estimate. Trees constructed by the algorithm based on
variability calibration have a topology markedly different from that of
trees constructed from the same alignments in the absence of
calibration. This is illustrated by means of trees constructed from
small-ribosomal-subunit RNA sequences of Metazoa. A reconstruction of
vertebrate evolution based on calibrated alignments matches the
consensus view of paleontologists, unlike trees based on uncalibrated
alignments. In trees derived from sequences covering several metazoan
phyla, artefacts in topology that are probably due to a high clock rate
in certain lineages are avoided.
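
The per-set distance computation described above can be sketched roughly as
follows. The abstract does not give the equation itself, so this sketch
assumes a Jukes-Cantor-style form, p = 3/4 * (1 - exp(-4/3 * v * d)), where v
is the average variability of a set of positions and d is the distance. The
function names, the handling of saturation, and the weighting of the final
average by set size are all illustrative assumptions, not the authors'
published procedure. Gaps and ambiguous nucleotides are not treated specially.

```python
import math


def dissimilarity(seq_a, seq_b, positions):
    """Fraction of the given alignment positions at which two sequences differ."""
    diffs = sum(1 for i in positions if seq_a[i] != seq_b[i])
    return diffs / len(positions)


def set_distance(f, v):
    """Solve p = 3/4 * (1 - exp(-4/3 * v * d)) for d, given dissimilarity f.

    Assumed Jukes-Cantor-style correction; v is the set's average variability.
    With v = 1 this reduces to the ordinary Jukes-Cantor distance.
    """
    if f >= 0.75:
        raise ValueError("dissimilarity at or above saturation (0.75)")
    return -(3.0 / (4.0 * v)) * math.log(1.0 - 4.0 * f / 3.0)


def calibrated_distance(seq_a, seq_b, variability_sets):
    """Average the per-set distances over all variability sets.

    variability_sets is a list of (positions, v) pairs. Weighting each set
    by its number of positions is an assumption made for this sketch.
    """
    weighted_sum = 0.0
    n_positions = 0
    for positions, v in variability_sets:
        f = dissimilarity(seq_a, seq_b, positions)
        weighted_sum += len(positions) * set_distance(f, v)
        n_positions += len(positions)
    return weighted_sum / n_positions


if __name__ == "__main__":
    # Toy alignment: a conserved set (low v) and a variable set (high v).
    a = "ACGTACGT"
    b = "ACGTTCGA"
    sets = [([0, 1, 2, 3], 0.5), ([4, 5, 6, 7], 2.0)]
    print(calibrated_distance(a, b, sets))
```

Dividing by v inside `set_distance` means that a given observed dissimilarity
translates into a smaller distance for a highly variable set than for a
conserved one, which is the intended effect of calibrating by variability.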