EXACT AND APPROXIMATION ALGORITHMS FOR SORTING BY REVERSALS, WITH APPLICATION TO GENOME REARRANGEMENT

Citation
J. Kececioglu et D. Sankoff, EXACT AND APPROXIMATION ALGORITHMS FOR SORTING BY REVERSALS, WITH APPLICATION TO GENOME REARRANGEMENT, Algorithmica, 13(1-2), 1995, pp. 180-210
Citations number
28
Categorie Soggetti
Computer Sciences",Mathematics,Mathematics,"Computer Science Software Graphycs Programming
Journal title
ISSN journal
01784617
Volume
13
Issue
1-2
Year of publication
1995
Pages
180 - 210
Database
ISI
SICI code
0178-4617(1995)13:1-2<180:EAAAFS>2.0.ZU;2-X
Abstract
Motivated by the problem in computational biology of reconstructing th e series of chromosome inversions by which one organism evolved from a nother, we consider the problem of computing the shortest series of re versals that transform one permutation to another. The permutations de scribe the order of genes on corresponding chromosomes, and a reversal takes an arbitrary substring of elements, and reverses their order. F or this problem, we develop two algorithms: a greedy approximation alg orithm, that finds a solution provably close to optimal in O(n(2)) tim e and O(n) space for n-element permutations, and a branch-and-bound ex act algorithm, that finds an optimal solution in O(mL(n, n)) time and O(n(2)) space, where m is the size of the branch-and-bound search tree , and L(n, n) is the time tc, solve a linear program of n variables an d n constraints. The greedy algorithm is the first to come within a co nstant factor of the optimum; it guarantees a solution that uses no mo re than twice the minimum number of reversals. The lower and upper bou nds of the branch-and-bound algorithm are a novel application of maxim um-weight matchings, shortest paths, and linear programming. In a seri es of experiments, we study the performance of an implementation on ra ndom permutations, and permutations generated by random reversals. For permutations differing by k random reversals, we find that the averag e upper bound on reversal distance estimates k to within one reversal for k < 1/2n and n less than or equal to 100. For the difficult case o f random permutations, we find that the average difference between the upper and lower bounds is less than three reversals for n less than o r equal to 50. Due to the tightness of these bounds, we can solve, to optimality, problems on 30 elements in a few minutes of computer time. This approaches the scale of mitochondrial genomes.