A METHOD OF POST-PROCESSING FOR CHARACTER-RECOGNITION BASED ON SYNTACTIC AND SEMANTIC ANALYSIS OF SENTENCES

Citation
K. Kise et al., A METHOD OF POST-PROCESSING FOR CHARACTER-RECOGNITION BASED ON SYNTACTIC AND SEMANTIC ANALYSIS OF SENTENCES, Systems and computers in Japan, 27(9), 1996, pp. 94-107
Number of citations
16
Subject categories
Computer Science Hardware & Architecture; Computer Science Information Systems; Computer Science Theory & Methods
ISSN journal
0882-1666
Volume
27
Issue
9
Year of publication
1996
Pages
94 - 107
Database
ISI
SICI code
0882-1666(1996)27:9<94:AMOPFC>2.0.ZU;2-4
Abstract
Post-processing of character recognition refers to the processing used to correct errors in character recognition. When the input is a string representing a sentence and highly precise error correction is required, both syntactic and semantic checks should be made at the sentence level. This paper assumes that the morphemes, syntax and semantics of the input sentence can be analyzed, and proposes a method that uses syntactic and semantic analysis in the post-processing. The proposed method receives the list of candidate characters, up to the fifth-ranked candidate, and outputs a sentence that is adequate from the viewpoints of both syntax and semantics. The method has three features: (1) during word matching, it also examines whether a sentence that is adequate from the viewpoints of syntax and semantics can be composed, which inhibits the extraction of inadequate words; (2) characters subject to stronger syntactic and semantic constraints, such as single-character particles and conjugational suffixes, are estimated top-down, so that the case where the adequate character is not contained in the candidates can be handled; and (3) words whose adequateness cannot be determined from the syntactic or semantic viewpoint are selected by character re-recognition processing. An experiment was carried out on 50 sample sentences. The character recognition rate improved from 83.0 percent to 98.0 percent, and the sentence recognition rate improved from 10.0 percent to 94.0 percent. Compared with a method based only on word matching, the sentence recognition rate improved by more than 20 percent. These results demonstrate the effectiveness of the proposed method.
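The word-matching step that the abstract builds on can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `match_words`, the toy English lexicon, and the `is_adequate` callback are all assumptions introduced here for illustration. The callback is only a stand-in for the syntactic and semantic check, and it is applied to complete word sequences rather than during matching as the abstract describes. Top-down estimation and re-recognition are not shown.

```python
from typing import Callable, List, Set


def match_words(
    candidates: List[List[str]],
    lexicon: Set[str],
    is_adequate: Callable[[List[str]], bool] = lambda words: True,
) -> List[List[str]]:
    """Return every word sequence that exactly covers the candidate lists.

    candidates[i] holds the recognizer's ranked candidates for position i
    (the abstract uses up to five). is_adequate is a hypothetical hook
    standing in for the sentence-level syntactic and semantic check.
    """
    n = len(candidates)
    # paths[i] holds all word sequences that exactly cover positions i..n-1.
    paths: List[List[List[str]]] = [[] for _ in range(n + 1)]
    paths[n] = [[]]  # the empty sequence covers the empty suffix
    for i in range(n - 1, -1, -1):
        for word in sorted(lexicon):  # sorted for a deterministic order
            j = i + len(word)
            if j > n:
                continue
            # A word matches if each of its characters is among the
            # candidates at the corresponding position.
            if all(ch in candidates[i + k] for k, ch in enumerate(word)):
                for rest in paths[j]:
                    paths[i].append([word] + rest)
    return [seq for seq in paths[0] if is_adequate(seq)]


# Toy input: two candidates per position, spaces omitted.
toy_candidates = [
    ["t", "f"], ["h", "b"], ["e", "c"],
    ["c", "e"], ["a", "o"], ["t", "l"],
]
toy_lexicon = {"the", "cat", "fat"}
print(match_words(toy_candidates, toy_lexicon))
```

In this toy input, "fat" is rejected because "a" is not among the candidates at the second position, so only one segmentation covers the whole string. A real system would need a morphological dictionary and a grammar in place of the toy lexicon and the accept-all default.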