Alignment algorithms can be used to infer a relationship between sequences
when the true relationship is unknown. Simple alignment algorithms use a co
st function that gives a fixed cost to each possible point mutation-mismatc
h, deletion, insertion. These algorithms tend to find optimal alignments th
at have many small gaps. It is more biologically plausible to have fewer lo
nger gaps rather than many small gaps in an alignment. To address this issu
e, linear gap cost algorithms are in common use for aligning biological seq
uence data. More reliable inferences are obtained by aligning more than two
sequences at a time. The obvious dynamic programming algorithm for optimal
ly aligning k sequences of length n runs in O(n(k)) time. This is impractic
al if k greater than or equal to 3 and n is of any reasonable length. Thus,
for this problem there are many heuristics for aligning k sequences, howev
er, they are not guaranteed to find an optimal alignment. In this paper, we
present a new algorithm guaranteed to find the optimal alignment for three
sequences using linear gap costs. This gives the same results as the dynam
ic programming algorithm for three sequences, but typically does so much mo
re quickly. It is particularly fast when the (three-way) edit distance is s
mall. Our algorithm uses a speed-up technique based on Ukkonen's greedy alg
orithm(Ukkonen, 1983) which he presented for two sequences and simple costs
. (C) 2000 Academic Press.