Chinese text distinction and font identification by recognizing most frequently used characters

Authors

Lin, CF; Fang, YF; Juang, YT

Citation

C.F. Lin et al., Chinese text distinction and font identification by recognizing most frequently used characters, IMAGE VIS C, 19(6), 2001, pp. 329-338

Subject categories

AI Robotics and Automatic Control

Journal title

IMAGE AND VISION COMPUTING

ISSN journal

0262-8856

Volume

19

Issue

6

Year of publication

2001

Pages

329 - 338

Database

ISI

SICI code

0262-8856(20010415)19:6<329:CTDAFI>2.0.ZU;2-A

Abstract

This study proposes a method for implementing three functions that can greatly assist a traditional OCCR (Optical Chinese Character Recognition) system: (1) identifying the font used in a document; (2) detecting and recognizing the most frequently used (MFU) characters; and (3) distinguishing between machine-printed and hand-written characters. According to Chang and Chen (Proceedings of the ICCC, 1994, pp. 310-316), the top-40 MFU characters account for about 20% of the Chinese characters in a text document. If those MFU characters can be detected before the traditional OCCR method is applied, there will be great savings in computation time.

The proposed method for character detection consists of three stages: segmentation, feature extraction, and classification. In the first stage, the projection-profile-based method presented by Wang et al. (Pattern Recognition 30 (1997) 1213) is used to segment individual characters from the input text document. In the second stage, three types of features are introduced: the density of black pixels, the projection profile code, and the modified skeleton template. These features are used to check whether a segmented character is semi-matched or fully matched with an MFU template. In the last stage, based on the matching result, three different algorithms for implementing the aforementioned functions are provided. Experimental results are given to demonstrate the practicality and superiority of the proposed method. (C) 2001 Elsevier Science B.V. All rights reserved.
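
The abstract names projection profiles and black-pixel density but gives no implementation detail. The following Python is only an illustrative sketch of those two generic ideas, not the method of Wang et al. or of the authors. It assumes a binarized text line stored as a list of rows, with 1 for a black pixel and 0 for background. The function names `column_profile`, `segment_columns`, and `black_density` are hypothetical. Splitting at every empty column is a deliberate simplification: many Chinese characters have left and right components separated by a gap, so a practical segmenter would need additional rules.

```python
from typing import List, Tuple

# Assumed representation: 1 = black (ink) pixel, 0 = white background.
Image = List[List[int]]


def column_profile(img: Image) -> List[int]:
    """Vertical projection profile: number of black pixels in each column."""
    if not img:
        return []
    return [sum(col) for col in zip(*img)]


def segment_columns(img: Image) -> List[Tuple[int, int]]:
    """Split a text line at empty columns (simplified, hypothetical rule).

    Returns half-open (start, end) column ranges, one per run of
    consecutive non-empty columns.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(column_profile(img)):
        if count > 0 and start is None:
            start = x                      # a run of ink begins
        elif count == 0 and start is not None:
            spans.append((start, x))       # the run ended at an empty column
            start = None
    if start is not None:                  # run reaches the right edge
        spans.append((start, len(img[0])))
    return spans


def black_density(img: Image, span: Tuple[int, int]) -> float:
    """Fraction of black pixels inside the given column range."""
    start, end = span
    area = len(img) * (end - start)
    black = sum(sum(row[start:end]) for row in img)
    return black / area if area else 0.0


if __name__ == "__main__":
    line = [
        [1, 1, 0, 0, 1, 0],
        [1, 0, 0, 1, 1, 1],
        [1, 1, 0, 0, 1, 0],
    ]
    for span in segment_columns(line):
        print(span, round(black_density(line, span), 3))
```

In a pipeline like the one the abstract outlines, a cheap feature such as this density could serve as a first filter before more expensive template comparison; the ordering and thresholds used in the paper are not stated in this record.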