Bj. Oommen et Rks. Loke, PATTERN-RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS, DELETIONS AND GENERALIZED TRANSPOSITION, Pattern recognition, 30(5), 1997, pp. 789-800
Citations number
23
Categorie Soggetti
Computer Sciences, Special Topics","Engineering, Eletrical & Electronic","Computer Science Artificial Intelligence
We study the problem of recognizing a string Y which is the noisy vers
ion of some unknown string X chosen from a finite dictionary, II. The
traditional case which has been extensively studied in the literature
is the one in which Y contains substitution, insertion and deletion (
SID) errors. Although some work has been done to extend the traditiona
l set of edit operations to include the straightforward transposition
of adjacent characters(2) [see J. Assoc. Comput. Mach. 22, 177-183 (19
75)] the problem is unsolved when the transposed characters are themse
lves subsequently substituted, as is typical in cursive and typewritte
n script, in molecular biology and in noisy chain-coded boundaries. In
this paper we present the first reported solution to the analytic pro
blem of editing one string X to another, Y, using these four edit oper
ations. A scheme for obtaining the optimal edit operations has also be
en given. Both these solutions are optimal for the infinite alphabet c
ase. Using these algorithms we present a syntactic pattern recognition
scheme which corrects noisy text containing all these types of errors
. The paper includes experimental results involving sub-dictionaries o
f the most common English words which demonstrate the superiority of o
ur system over existing methods. (C) 1997 Pattern Recognition Society.