FAST APPROXIMATE MATCHING OF WORDS AGAINST A DICTIONARY

Authors
Citation
H. Bunke, FAST APPROXIMATE MATCHING OF WORDS AGAINST A DICTIONARY, Computing, 55(1), 1995, pp. 75-89
Citations number
23
Categorie Soggetti
Computer Sciences","Computer Science Theory & Methods
Journal title
ISSN journal
0010485X
Volume
55
Issue
1
Year of publication
1995
Pages
75 - 89
Database
ISI
SICI code
0010-485X(1995)55:1<75:FAMOWA>2.0.ZU;2-4
Abstract
A new algorithm for string edit distance computation is given. The alg orithm assumes that one of the two strings to be compared is a diction ary entry that is known a priori. This dictionary word is converted in an off-line phase into a deterministic finite state automaton. Given an input string and the automaton derived from the dictionary word, th e computation of the edit distance between the two strings corresponds to a traversal of the states of the automaton. This procedure needs t ime which is only linear in the length of the input string. It is inde pendent of the length of the dictionary word. Given not only one but N different dictionary words, their corresponding automata can be combi ned into a single deterministic finite state automaton. Thus the compu tation of the edit distance between the input word and each dictionary entry, and the determination of the nearest neighbor in the dictionar y need time that is only linear in the length of the input string. How ever, the number os states of the automaton is exponential.