T. Watanabe et al., LAYOUT RECOGNITION OF MULTI-KINDS OF TABLE-FORM DOCUMENTS, IEEE transactions on pattern analysis and machine intelligence, 17(4), 1995, pp. 432-445
Many studies have reported that knowledge-based layout recognition
methods are very successful at automatically classifying the
meaningful data in document images. However, these approaches are applicable to
only a single kind of document, because they rely on structure
definition information that is specified in advance in order to
analyze a particular class of documents intelligently. In this paper,
we propose a method to recognize the layout structures of multiple
kinds of table-form document images. For this purpose, we introduce a
classification tree to manage the relationships among different
classes of layout structures. Our recognition system has two modes:
layout knowledge acquisition and layout structure recognition. In the
layout knowledge acquisition mode, table-form document images are
distinguished according to this classification tree and then the
structure description trees that specify the logical structures of table-form
documents are generated automatically. In the layout structure
recognition mode, by contrast, individual item fields in the
table-form document images are extracted and classified successfully
by searching the classification tree and interpreting the structure
description tree.
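
The two modes described above can be illustrated with a minimal sketch.
It is not the authors' implementation: the abstract does not say which
features the classification tree branches on or how a structure
description tree is encoded. The sketch assumes a hypothetical layout
signature (a tuple of coarse integer features such as row and column
counts) and reduces each structure description tree to a mapping from
item-field names to cell indices. All class and method names are
invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumption: a layout class is identified by a tuple of coarse integer
# features, e.g. (row_count, column_count). The paper's real features may differ.
Signature = Tuple[int, ...]


@dataclass
class StructureDescription:
    """Stand-in for a structure description tree (hypothetical encoding):
    maps each item-field name to the index of the cell that holds it."""
    fields: Dict[str, int]


@dataclass
class ClassificationNode:
    """One node of the classification tree; children are keyed by feature value."""
    children: Dict[int, "ClassificationNode"] = field(default_factory=dict)
    description: Optional[StructureDescription] = None


class LayoutClassifier:
    def __init__(self) -> None:
        self.root = ClassificationNode()

    def acquire(self, signature: Signature, description: StructureDescription) -> None:
        """Layout knowledge acquisition: register a new layout class,
        creating tree nodes along the signature path as needed."""
        node = self.root
        for feature in signature:
            node = node.children.setdefault(feature, ClassificationNode())
        node.description = description

    def recognize(self, signature: Signature, cells: List[str]) -> Optional[Dict[str, str]]:
        """Layout structure recognition: search the tree for the layout class,
        then interpret its description to label the item fields."""
        node = self.root
        for feature in signature:
            next_node = node.children.get(feature)
            if next_node is None:
                return None  # unknown layout class
            node = next_node
        if node.description is None:
            return None  # path exists but no layout was registered here
        return {name: cells[i] for name, i in node.description.fields.items()}


clf = LayoutClassifier()
clf.acquire((2, 3), StructureDescription({"name": 0, "date": 2}))
result = clf.recognize((2, 3), ["Alice", "unused", "1995-04"])
```

In this sketch a form whose signature was never acquired yields `None`,
which is where a real system would fall back to the acquisition mode.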