A guided tour to approximate string matching

Authors
Citation
G. Navarro, A guided tour to approximate string matching, ACM C SURV, 33(1), 2001, pp. 31-88
Citations number
155
Categorie Soggetti
Computer Science & Engineering
Journal title
ACM COMPUTING SURVEYS
ISSN journal
03600300 → ACNP
Volume
33
Issue
1
Year of publication
2001
Pages
31 - 88
Database
ISI
SICI code
0360-0300(200103)33:1<31:AGTTAS>2.0.ZU;2-Q
Abstract
We survey the current techniques to cope with the problem of string matchin g that allows errors. This is becoming a more and more relevant issue for m any fast growing areas such as information retrieval and computational biol ogy. We focus on online searching and mostly on edit distance, explaining t he problem and its relevance, its statistical behavior, its history and cur rent developments, and the central ideas of the algorithms and their comple xities. We present a number of experiments to compare the performance of th e different algorithms and show which are the best choices. We conclude wit h some directions for future work and open problems.