M. Vingron and P. R. Sibbald, WEIGHTING IN SEQUENCE SPACE - A COMPARISON OF METHODS IN TERMS OF GENERALIZED SEQUENCES, Proceedings of the National Academy of Sciences of the United States of America, 90(19), 1993, pp. 8777-8781
Four methods for weighting aligned biological sequences have recently
appeared that differ mathematically, philosophically, and in their
results. Thus, while there is consensus about the need to weight
sequences, the method to use is contentious. A geometric analysis based
on a continuous sequence space is presented that provides a common
framework in which to compare the methods. It is concluded that there
are two "best" methods. When the sequences are known to be
phylogenetically related and a tree can be generated without
introducing excessive stress into the data, the method of Altschul et
al. [Altschul, S. F., Carroll, R. J. & Lipman, D. J. (1989) J. Mol.
Biol. 207, 647-653] is appropriate. When the sequences are not known to
be phylogenetically related or a tree cannot be produced without unduly
distorting the distances between the sequences, a modification of the
method of Sibbald and Argos [Sibbald, P. R. & Argos, P. (1990) J. Mol.
Biol. 216, 813-818] is preferable.