A SCRIPT-INDEPENDENT METHODOLOGY FOR OPTICAL CHARACTER-RECOGNITION

Citation
J. Makhoul et al., A SCRIPT-INDEPENDENT METHODOLOGY FOR OPTICAL CHARACTER-RECOGNITION, Pattern recognition, 31(9), 1998, pp. 1285-1294
Number of citations
36
Subject Categories
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
Journal title
Pattern Recognition
Journal ISSN
0031-3203
Volume
31
Issue
9
Year of publication
1998
Pages
1285 - 1294
Database
ISI
SICI code
0031-3203(1998)31:9<1285:ASMFOC>2.0.ZU;2-X
Abstract
We present a methodology for OCR that exhibits the following properties: script-independent feature extraction, training, and recognition components; no separate segmentation at the character and word levels; and the training is performed automatically on data that is also not presegmented. The methodology is adapted to OCR from continuous speech recognition, which has developed a mature and successful technology based on Hidden Markov Models. The script independence of the methodology is demonstrated using omnifont experiments on the DARPA Arabic OCR Corpus and the University of Washington English Document Image Database I. (C) 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.