Two- and higher-dimensional pattern matching in optimal expected time

Citation
J. Karkkainen et E. Ukkonen, Two- and higher-dimensional pattern matching in optimal expected time, SIAM J COMP, 29(2), 1999, pp. 571-589
Citations number
26
Categorie Soggetti
Computer Science & Engineering
Journal title
SIAM JOURNAL ON COMPUTING
ISSN journal
00975397 → ACNP
Volume
29
Issue
2
Year of publication
1999
Pages
571 - 589
Database
ISI
SICI code
0097-5397(199912)29:2<571:TAHPMI>2.0.ZU;2-B
Abstract
Algorithms with optimal expected running time are presented for searching t he occurrences of a two-dimensional m x m pattern P in a two-dimensional n x n text T over an alphabet of size c. The algorithms are based on placing in the text a static grid of test points, determined only by n, m, and c (n ot dynamically by earlier test results). Using test strings read from the t est points the algorithms eliminate as many potential occurrences of P as p ossible. The remaining potential occurrences are separately checked for act ual occurrences. A suitable choice of the test point set leads to algorithm s with expected running time O(n(2) log(c) m(2)/m(2)) using the uniform Ber noulli model of randomness. This is shown to be optimal by a generalization of a one-dimensional lower bound result by Yao. Experimental results show that the algorithms are efficient in practice, too. The method is also gene ralized for the k mismatches problem. The resulting algorithm has expected running time O(kn(2) log(c) m(2)/m(2)), provided that k less than or equal to (mright perpendicularm/inverted right perpendicularlog(c) m(2)inverted l eft perpendicularleft perpendicular - 1)/2. All algorithms need preprocessi ng of P which takes time and space O(m(2)). The text processing can be done on-line, using a rather small window. The algorithms easily generalize to d-dimensional matching for any d.