ITA
ENG

Text indexing and dictionary matching with one error

Authors

Amir, A Keselman, D Landau, GM Lewenstein, M Lewenstein, N Rodeh, M

Citation

A. Amir et al., Text indexing and dictionary matching with one error, J ALGORITHM, 37(2), 2000, pp. 309-325

Citations number

Categorie Soggetti

Computer Science & Engineering

Journal title

JOURNAL OF ALGORITHMS

ISSN journal

01966774 → ACNP

Volume

Issue

Year of publication

2000

Pages

309 - 325

Database

ISI

SICI code

0196-6774(200011)37:2<309:TIADMW>2.0.ZU;2-W

Abstract

The indexing problem is where a text is preprocessed and subsequent queries of the form "Find all occurrences of pattern P in the text" are answered i n time proportional to the length of the query and the number of occurrence s. In the dictionary matching problem a set of patterns is preprocessed and subsequent queries of the form "Find all occurrences of dictionary pattern s in text T" are answered in time proportional to the length of the text an d the number of occurrences. There exist efficient worst-case solutions for the indexing problem and the dictionary matching problem, but none that find approximate occurrences of the patterns, i.e., where the pattern is within a bound edit (or Hamming) distance from the appropriate text location. In this paper we present a uniform deterministic solution to both the index ing and the general dictionary matching problem with one error. We preproce ss the data in time O(n log(2) n), where n is the text size in the indexing problem and the dictionary size in the dictionary matching problem. Our qu ery time for the indexing problem is O(m log n log log n + tocc), where m i s the query string size and tocc is the number of occurrences, Our query ti me for the dictionary matching problem is O(n log(3) d log log d + tocc), w here n is the text size and d the dictionary size. The time bounds above ap ply to both bounded and unbounded alphabets, (C) 2000 Academic Press.