Multiple sequence comparison is a basic problem in molecular biology and
other sciences. In this paper, we introduce the concept of a complete
information set and some principles for measuring discrepancy among
multiple sequences. Based on them, we present a new measurement method
that satisfies these principles for comparing multiple sequences. We
illustrate that this method can effectively distinguish different random
sequences or DNA sequences of length 8000 by comparing strings of 6-8
symbols (bases), or protein sequences of length 8000 by comparing strings
of 3-4 symbols (amino acids). It can also measure slight changes in a
sequence, e.g., the insertion or deletion of a single symbol (a base or an
amino acid). We apply the method to the study of molecular evolution, and
the preliminary result shows a hierarchical relationship among the
cytochrome c protein sequences of different species, much like that in
taxonomy. (C) 2001 Elsevier Science Inc. All rights reserved.
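
To make the idea of comparing sequences through their short symbol strings concrete, the following is a minimal sketch rather than the method of this paper. The abstract does not state the measure itself, so the sketch assumes a stand-in: the generalized Jensen-Shannon divergence between the frequency distributions of overlapping length-k strings. The function names `kmer_distribution` and `discrepancy` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def kmer_distribution(seq, k):
    """Relative frequencies of all overlapping length-k substrings of seq."""
    if k < 1 or len(seq) < k:
        raise ValueError("sequence must contain at least one length-k string")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def discrepancy(seqs, k):
    """Generalized Jensen-Shannon divergence of the k-mer distributions.

    Stand-in measure (an assumption, not the paper's formula): the mean
    Kullback-Leibler divergence, in bits, of each sequence's k-mer
    distribution from the average of all the distributions.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences to compare")
    dists = [kmer_distribution(s, k) for s in seqs]
    n = len(dists)
    # Every k-mer that occurs in at least one sequence.
    support = set().union(*dists)
    mean = {w: sum(d.get(w, 0.0) for d in dists) / n for w in support}
    # Terms with p == 0 contribute nothing, so only observed k-mers are summed.
    return sum(p * log2(p / mean[w]) for d in dists for w, p in d.items()) / n
```

Under this assumed measure, identical sequences score 0, and sequences that share no length-k string score log2 of the number of sequences. Deleting a single symbol shifts only a few string counts, so the score stays small but nonzero, which is the kind of sensitivity the abstract describes.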