PROCESSING OF BINARY IMAGES OF HANDWRITTEN TEXT DOCUMENTS

Citation
I.S.I. Abuhaiba et al., PROCESSING OF BINARY IMAGES OF HANDWRITTEN TEXT DOCUMENTS, Pattern Recognition, 29(7), 1996, pp. 1161-1177
Number of citations
21
Subject Categories
Computer Sciences, Special Topics; Engineering, Electrical & Electronic; Computer Science, Artificial Intelligence
Journal title
Pattern Recognition
ISSN journal
0031-3203
Volume
29
Issue
7
Year of publication
1996
Pages
1161 - 1177
Database
ISI
SICI code
0031-3203(1996)29:7<1161:POBIOH>2.0.ZU;2-D
Abstract
This paper deals with three different problems in the processing of binary images of handwritten text documents. Firstly, an integrated algorithm that finds a straight line approximation of a textual stroke is described. It has the advantage of using the distance transform of thinned binary images to identify spurious bifurcation points, which are unavoidable when thinning algorithms are used, remove them and recover the original ones. The obtained straight line approximations preserve the structural information of the original pattern. The algorithm does not resort to distortable geometrical properties. Secondly, a method is presented to recover loops that become blobs due to blotting. The method depends on removing the pixels whose distance transform exceeds a calculated threshold. Unfortunately, it seems that it is not possible to recover such loops with a high rate of success. The authors suggest that the inclusion of thickness information, in the line segments that connect the vertices of the straight line approximations produced by the previous algorithm, is a step towards a solution of this problem. Finally, a method is developed to extract lines from pages of handwritten text, by finding the shortest spanning tree of a graph formed from the set of main strokes. Then, main strokes of extracted lines are arranged in the same order as they were written by following the path in which they are contained. Then, every secondary stroke is assigned to the closest main stroke. At the end, an ordered list of main strokes, each with the corresponding number of assigned secondary strokes, is obtained. Each combination of main-secondary strokes can be the input to a subsequent recognition stage. The method proved to be powerful and more suited to variable handwriting. Copyright (C) 1996 Pattern Recognition Society.
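
The third step of the abstract (a shortest spanning tree over main strokes, then assignment of each secondary stroke to the closest main stroke) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that each stroke is reduced to a single representative point, that the graph is the complete graph over those points, and that "closest" means Euclidean distance. The function names `spanning_tree` and `assign_secondary` and the sample coordinates are hypothetical.

```python
import math

def spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Assumption: one representative (x, y) point per main stroke.
    Returns the tree as a list of (i, j) index pairs.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link into the tree
    best_from = [-1] * n         # tree vertex providing that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree with the cheapest link.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax links from the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def assign_secondary(main_points, secondary_points):
    """For each secondary stroke, return the index of the nearest main stroke."""
    return [
        min(range(len(main_points)),
            key=lambda m: math.dist(main_points[m], s))
        for s in secondary_points
    ]

# Hypothetical data: three main strokes on one text line, two on the next.
main = [(0, 0), (10, 1), (20, 0), (0, 30), (11, 31)]
secondary = [(10, 4), (1, 27)]   # e.g. dots or diacritics

edges = spanning_tree(main)
owners = assign_secondary(main, secondary)
print(edges)
print(owners)
```

In this toy layout the tree contains short edges between neighbouring strokes and a single long edge joining the two rows. How the paper derives text lines and writing order from the tree is not stated in the abstract, so the sketch stops at the tree and the nearest-stroke assignment.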