J. Kececioglu et D. Sankoff, EXACT AND APPROXIMATION ALGORITHMS FOR SORTING BY REVERSALS, WITH APPLICATION TO GENOME REARRANGEMENT, Algorithmica, 13(1-2), 1995, pp. 180-210
Motivated by the problem in computational biology of reconstructing th
e series of chromosome inversions by which one organism evolved from a
nother, we consider the problem of computing the shortest series of re
versals that transform one permutation to another. The permutations de
scribe the order of genes on corresponding chromosomes, and a reversal
takes an arbitrary substring of elements, and reverses their order. F
or this problem, we develop two algorithms: a greedy approximation alg
orithm, that finds a solution provably close to optimal in O(n(2)) tim
e and O(n) space for n-element permutations, and a branch-and-bound ex
act algorithm, that finds an optimal solution in O(mL(n, n)) time and
O(n(2)) space, where m is the size of the branch-and-bound search tree
, and L(n, n) is the time tc, solve a linear program of n variables an
d n constraints. The greedy algorithm is the first to come within a co
nstant factor of the optimum; it guarantees a solution that uses no mo
re than twice the minimum number of reversals. The lower and upper bou
nds of the branch-and-bound algorithm are a novel application of maxim
um-weight matchings, shortest paths, and linear programming. In a seri
es of experiments, we study the performance of an implementation on ra
ndom permutations, and permutations generated by random reversals. For
permutations differing by k random reversals, we find that the averag
e upper bound on reversal distance estimates k to within one reversal
for k < 1/2n and n less than or equal to 100. For the difficult case o
f random permutations, we find that the average difference between the
upper and lower bounds is less than three reversals for n less than o
r equal to 50. Due to the tightness of these bounds, we can solve, to
optimality, problems on 30 elements in a few minutes of computer time.
This approaches the scale of mitochondrial genomes.