Text indexing and dictionary matching with one error

Citation
A. Amir et al., Text indexing and dictionary matching with one error, J ALGORITHM, 37(2), 2000, pp. 309-325
Citations number
28
Categorie Soggetti
Computer Science & Engineering
Journal title
JOURNAL OF ALGORITHMS
ISSN journal
01966774 → ACNP
Volume
37
Issue
2
Year of publication
2000
Pages
309 - 325
Database
ISI
SICI code
0196-6774(200011)37:2<309:TIADMW>2.0.ZU;2-W
Abstract
The indexing problem is where a text is preprocessed and subsequent queries of the form "Find all occurrences of pattern P in the text" are answered i n time proportional to the length of the query and the number of occurrence s. In the dictionary matching problem a set of patterns is preprocessed and subsequent queries of the form "Find all occurrences of dictionary pattern s in text T" are answered in time proportional to the length of the text an d the number of occurrences. There exist efficient worst-case solutions for the indexing problem and the dictionary matching problem, but none that find approximate occurrences of the patterns, i.e., where the pattern is within a bound edit (or Hamming) distance from the appropriate text location. In this paper we present a uniform deterministic solution to both the index ing and the general dictionary matching problem with one error. We preproce ss the data in time O(n log(2) n), where n is the text size in the indexing problem and the dictionary size in the dictionary matching problem. Our qu ery time for the indexing problem is O(m log n log log n + tocc), where m i s the query string size and tocc is the number of occurrences, Our query ti me for the dictionary matching problem is O(n log(3) d log log d + tocc), w here n is the text size and d the dictionary size. The time bounds above ap ply to both bounded and unbounded alphabets, (C) 2000 Academic Press.