ITA
ENG

MULTIPLE FILTRATION AND APPROXIMATE PATTERN-MATCHING

Authors

PEVZNER PA WATERMAN MS

Citation

Pa. Pevzner et Ms. Waterman, MULTIPLE FILTRATION AND APPROXIMATE PATTERN-MATCHING, Algorithmica, 13(1-2), 1995, pp. 135-154

Citations number

Categorie Soggetti

Computer Sciences",Mathematics,Mathematics,"Computer Science Software Graphycs Programming

Journal title

Algorithmica → ACNP

ISSN journal

01784617

Volume

Issue

1-2

Year of publication

1995

Pages

135 - 154

Database

ISI

SICI code

0178-4617(1995)13:1-2<135:MFAAP>2.0.ZU;2-4

Abstract

Given a text of length n and a query of length q, we present an algori thm for finding all locations of m-tuples in the text and in the query that differ by at most k mismatches. This problem is motivated by the dot-matrix constructions for sequence comparison and optimal oligonuc leotide probe selection routinely used in molecular biology. In the ca se q = m the problem coincides with the classical approximate string m atching with k mismatches problem. We present a new approach to this p roblem based on multiple hashing, which may have advantages over some sophisticated and theoretically efficient methods that have been propo sed. This paper describes a two-stage process. The first stage (multip le filtration) uses a new technique to preselect roughly similar m-tup les. The second stage compares these In-tuples using an accurate metho d. We demonstrate the advantages of multiple filtration in comparison with other techniques for approximate pattern matching.