Motivation: Circular permutation of a protein is a genetic operation in whi
ch parr of the C-terminal of the protein is moved to its N-terminal. Recent
ly; it has been shown that proteins that undergo engineered circular permut
ations generally maintain their three dimensional structure and biological
function. This observation raises the possibility that circular permutation
has occured in Nature during evolution. In this scenario a protein underwe
nt circular permutation into another protein, there after both proteins fur
ther diverged by standard genetic operations To study this possibility one
needs an efficient algorithm that for a given pair of proteins can detect t
he underlying event of circular permutations. A possible formal description
of the question is: given two sequences, find a circular permutation of on
e of them under,which the edit distance between the proteins is minimal. A
naive algorithm might take time proportional to N-3 or even N-4, which is p
rohibitively slow for a large-scale survey. A sophisticated algorithm that
runs in asymptotic time of N-2 was recently suggested, but it is not practi
cal for a large-scale survey:
Results: A simple and efficient algorithm that runs in time N2 is presented
The algorithm is based on duplicating one of the two sequences, and then p
erforming a modified version of the standard dynamic programming algorithm.
While the algorithm is not guaranteed to find the optimal results, we pres
ent data that indicate that in practice the algorithm performs very well.
Availability: A Fortran program that calculates the optimal edit distance u
nder circular permutation is available upon request from the authors.
Contact: ron@biocom1.ls.biu.ac.il.