A TAXONOMY OF SUBLINEAR MULTIPLE KEYWORD PATTERN-MATCHING ALGORITHMS

Authors
Citation
Bw. Watson et G. Zwaan, A TAXONOMY OF SUBLINEAR MULTIPLE KEYWORD PATTERN-MATCHING ALGORITHMS, Science of computer programming, 27(2), 1996, pp. 85-118
Citations number
20
Categorie Soggetti
Computer Sciences","Computer Science Software Graphycs Programming
ISSN journal
01676423
Volume
27
Issue
2
Year of publication
1996
Pages
85 - 118
Database
ISI
SICI code
0167-6423(1996)27:2<85:ATOSMK>2.0.ZU;2-P
Abstract
This article presents a taxonomy of sublinear keyword pattern matching algorithms related to the Boyer-Moore algorithm [3] and the Commentz- Walter algorithm [5,6]. The taxonomy includes, amongst others, the mul tiple keyword generalization of the single keyword Boyer-Moore algorit hm and an algorithm by Fan and Su [9,10]. The corresponding precomputa tion algorithms are presented as well. The taxonomy is based on the id ea of ordering algorithms according to their essential problem and alg orithm details, and deriving all algorithms from a common starting poi nt by successively adding these details in a correctness preserving wa y. This way of presentation not only provides a complete correctness a rgument of each algorithm, but also makes very clear what algorithms h ave in common (the details of their nearest common ancestor) and where they differ (the details added after their nearest common ancestor). Introduction of the notion of safe shift distances proves to be essent ial in the derivation and classification of the algorithms. Moreover, the article provides a common derivation for and a uniform presentatio n of the precomputation algorithms, not yet found in the literature.