HOW THE CHARACTER COMPARISON ORDER SHAPES THE SHIFT FUNCTION OF ONLINE PATTERN-MATCHING ALGORITHMS

Citation
L. Colussi et L. Toniolo, HOW THE CHARACTER COMPARISON ORDER SHAPES THE SHIFT FUNCTION OF ONLINE PATTERN-MATCHING ALGORITHMS, Theoretical computer science, 163(1-2), 1996, pp. 117-144
Citations number
27
Categorie Soggetti
Computer Sciences","Computer Science Theory & Methods
ISSN journal
03043975
Volume
163
Issue
1-2
Year of publication
1996
Pages
117 - 144
Database
ISI
SICI code
0304-3975(1996)163:1-2<117:HTCCOS>2.0.ZU;2-3
Abstract
String matching is the problem of finding all occurrences of a string W[0...m - 1] of length m called a pattern, in a longer string J[0...n - 1] of length n called a text. Several string matching algorithms hav e been designed to solve the problem in linear time; most of them work in two steps, called pattern preprocessing and text search step. The paper addresses the definition and computation of the shift function i n the pattern preprocessing step of on-line string matching algorithms . The shift function depends essentially on the order the pattern char acters are compared with the corresponding text characters. We conside r a family F of algorithms that do not change the character comparison order J during execution and we present a uniform definition of shift function delta(J) for such algorithms via a function imin(J). The def inition allows one to compute delta(J) in O(m log log m) time in the w orst case, given imin(J), but sufficient conditions to compute delta(J ) in O(m) time are provided. Computing imin(J) requires O(m(2)) compar isons in general. We introduce the class of compact orders (which is t he generalization of Knuth-Morris-Pratt, Boyer-Moore and Crochemore-Pe rrin character comparison orders) and we give algorithms to compute bo th function imin(J) and shift function delta(J) in O(m) time for all c ompact orders. We show that given the order J and the pattern W there exists a set C of equivalent orders such that the function imin(K) can be computed in linear time given imin(J) for all orders K is an eleme nt of C. Moreover, we characterize two orders in the set C that respec tively minimize and maximize the values of the shift function and we s how that for both those orders the shift function can be computed in l inear time given imin(J).