INCREMENTAL STRING COMPARISON

Citation
Gm. Landau et al., INCREMENTAL STRING COMPARISON, SIAM journal on computing, 27(2), 1998, pp. 557-582
Citations number
28
Categorie Soggetti
Computer Science Theory & Methods",Mathematics,"Computer Science Theory & Methods",Mathematics
Journal title
ISSN journal
00975397
Volume
27
Issue
2
Year of publication
1998
Pages
557 - 582
Database
ISI
SICI code
0097-5397(1998)27:2<557:>2.0.ZU;2-S
Abstract
The problem of comparing two sequences A and B to determine their long est common subsequence (LCS) or the edit distance between them has bee n much studied. In this paper we consider the following incremental ve rsion of these problems: given an appropriate encoding of a comparison between A and B, can one incrementally compute the answer for A and b B, and the answer for A and Bb with equal efficiency, where b is an ad ditional symbol? Our main result is a theorem exposing a surprising re lationship between the dynamic programming solutions for two such ''ad jacent'' problems. Given a threshold k on the number of differences to be permitted in an alignment, the theorem leads directly to an O(k) a lgorithm for incrementally computing a new solution from an old one, a s contrasts the O(k(2)) time required to compute a solution from scrat ch. We further show, with a series of applications, that this algorith m is indeed more powerful than its nonincremental counterpart. We show this by solving the applications with greater asymptotic efficiency t han heretofore possible. For example, we obtain O(nk) algorithms for t he longest prefix approximate match problem, the approximate overlap p roblem, and cyclic string comparison.