Integrating knowledge sources in Devanagari text recognition system

Authors

Bansal, V; Sinha, RMK

Citation

V. Bansal and R.M.K. Sinha, Integrating knowledge sources in Devanagari text recognition system, IEEE SYST A, 30(4), 2000, pp. 500-505

Citations number

Subject categories

AI Robotics and Automatic Control

Journal title

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS

ISSN journal

1083-4427

Volume

30

Issue

4

Year of publication

2000

Pages

500 - 505

Database

ISI

SICI code

1083-4427(200007)30:4<500:IKSIDT>2.0.ZU;2-D

Abstract

The reading process has been widely studied and there is a general agreement among researchers that knowledge in different forms and at different levels plays a vital role. This is the underlying philosophy of the Devanagari document recognition system described in this work. The knowledge sources we use are mostly statistical in nature or in the form of a word dictionary tailored specifically for optical character recognition (OCR). We do not perform any reasoning on these. However, we explore their relative importance and role in the hierarchy. Some of the knowledge sources are acquired a priori by an automated training process while others are extracted from the text as it is processed. A complete Devanagari OCR system has been designed and tested with real-life printed documents of varying size and font. Most of the documents used were photocopies of the original. A performance of approximately 90% correct recognition is achieved.