K. Kise et al., A METHOD OF POST-PROCESSING FOR CHARACTER-RECOGNITION BASED ON SYNTACTIC AND SEMANTIC ANALYSIS OF SENTENCES, Systems and computers in Japan, 27(9), 1996, pp. 94-107
Number of citations
16
Subject Categories
Computer Science Hardware & Architecture; Computer Science Information Systems; Computer Science Theory & Methods
Post-processing in character recognition refers to the processing used
to correct recognition errors. When the input is a string representing
a sentence, highly precise error correction calls for syntactic as well
as semantic examination at the sentence level. This paper assumes that
the morphemes, syntax, and semantics of the input sentence can be
analyzed, and proposes a post-processing method that uses syntactic and
semantic analysis. The proposed method receives the list of candidate
characters up to the fifth-ranked candidate, and outputs a sentence that
is adequate from the viewpoints of both syntax and semantics. The method
has three features: (1) during word matching, it also examines whether
a sentence adequate from the viewpoints of syntax and semantics can be
composed, which suppresses the extraction of inadequate words; (2)
characters subject to stronger syntactic and semantic constraints, such
as single-character particles and conjugational suffixes, are estimated
top-down, so that cases where the adequate character is not contained
in the candidates can be handled; and (3) words whose adequacy cannot
be determined from the syntactic or semantic viewpoint are selected by
character re-recognition processing. An experiment was conducted on 50
sample sentences. The character recognition rate improved from 83.0
percent to 98.0 percent, and the sentence recognition rate improved
from 10.0 percent to 94.0 percent. Compared with a method based only on
word matching, the sentence recognition rate improved by more than 20
percent, which demonstrates the effectiveness of the proposed method.
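
To make the word-matching step more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a toy English lexicon and a hand-made candidate list (the paper deals with Japanese text and up to five candidates per character); the names `match_words`, `segmentations`, `candidates`, and `lexicon` are hypothetical. The sketch only enumerates word sequences that can be read off the candidate characters. The syntactic and semantic filtering, top-down estimation, and re-recognition described in the abstract are not modeled.

```python
def match_words(candidates, lexicon):
    """Find every lexicon word that can be spelled from the candidate
    characters at consecutive positions.

    candidates: list of strings, one per character position, each string
                holding the ranked candidate characters for that position.
    Returns a list of (start, end, word) triples, end exclusive.
    """
    hits = []
    n = len(candidates)
    for start in range(n):
        for word in lexicon:
            end = start + len(word)
            if end <= n and all(
                ch in candidates[start + i] for i, ch in enumerate(word)
            ):
                hits.append((start, end, word))
    return hits


def segmentations(candidates, lexicon):
    """Enumerate all word sequences that cover every character position."""
    n = len(candidates)
    by_start = {}
    for start, end, word in match_words(candidates, lexicon):
        by_start.setdefault(start, []).append((end, word))

    def extend(pos):
        if pos == n:
            yield []
            return
        for end, word in by_start.get(pos, []):
            for rest in extend(end):
                yield [word] + rest

    return list(extend(0))


# Toy input (an assumption for illustration): the top-ranked candidates
# spell "tbecot", which is not a valid word sequence.
candidates = ["t", "bh", "e", "c", "oa", "t"]
lexicon = ["the", "he", "cat", "cot"]

for words in segmentations(candidates, lexicon):
    print(" ".join(words))
```

In this toy input, word matching alone leaves two readings ("the cat" and "the cot"). This is the kind of ambiguity the abstract says is resolved by syntactic and semantic examination at the sentence level, or by re-recognition when those constraints cannot decide.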