K. Marukawa et al., DOCUMENT-RETRIEVAL TOLERATING CHARACTER-RECOGNITION ERRORS - EVALUATION AND APPLICATION, Pattern Recognition, 30(8), 1997, pp. 1361-1371
Number of citations
17
Subject Categories
"Computer Sciences, Special Topics","Engineering, Electrical & Electronic","Computer Science, Artificial Intelligence"
This paper presents two methods of combining character recognition with
techniques for retrieving Japanese documents and also shows how these
methods can be applied to textual image retrieval. Both retrieval methods
are tolerant of errors that occur during the character recognition
process. The basic idea is to utilize the characteristics of recognition
errors. One uses a confusion matrix to generate "equivalent" query strings
that should match erroneously recognized text. The other one searches a
"non-deterministic text" that contains multiple candidates for ambiguous
recognition results. Simulation experiments have shown that both methods
can effectively combine character recognition with retrieval techniques.
(C) 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
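
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the function names, the toy confusion table (Latin characters rather than Japanese), and the representation of a "non-deterministic text" as a list of candidate sets per character position are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical confusion table (an assumption, not data from the paper):
# true character -> characters the recognizer might output instead.
CONFUSIONS = {
    "l": ["1", "I"],
    "O": ["0"],
}


def expand_query(query, confusions, max_variants=1000):
    """Method 1 sketch: generate 'equivalent' query strings by substituting
    each character with its likely misrecognitions."""
    options = [[ch] + confusions.get(ch, []) for ch in query]
    variants = []
    for combo in product(*options):
        variants.append("".join(combo))
        if len(variants) >= max_variants:  # cap the combinatorial growth
            break
    return variants


def search_with_expansion(query, ocr_text, confusions):
    """Return sorted start offsets where any query variant occurs in the
    (possibly erroneous) recognized text."""
    hits = set()
    for variant in expand_query(query, confusions):
        start = ocr_text.find(variant)
        while start != -1:
            hits.add(start)
            start = ocr_text.find(variant, start + 1)
    return sorted(hits)


def search_nondeterministic(query, candidates):
    """Method 2 sketch: 'candidates' holds one set of candidate characters
    per position. A match at offset i requires every query character to be
    among the candidates at the corresponding position."""
    n, m = len(candidates), len(query)
    return [
        i
        for i in range(n - m + 1)
        if all(query[j] in candidates[i + j] for j in range(m))
    ]
```

For example, search_with_expansion("lOg", "cata10g of 1Ogs", CONFUSIONS) finds both misrecognized occurrences of the query, and search_nondeterministic("cat", [{"c", "e"}, {"a", "o"}, {"t", "f"}, {"s"}]) matches at offset 0 because each query character appears among the candidates for its position. The first approach leaves the stored text unchanged and pays at query time; the second stores more per character but searches with the original query.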