O. Hori and D. S. Doermann, Table-Form Structure Analysis Based on Box-Driven Reasoning, IEICE Transactions on Information and Systems, E79-D(5), 1996, pp. 542-547
Table-form document structure analysis is an important problem in the
document processing domain. This paper presents a new method called
Box-Driven Reasoning (BDR) to robustly analyze the structure of
table-form documents that include touching characters and broken lines.
Real documents are copied repeatedly and overlaid with printed data,
resulting in characters that touch cells and lines that are broken.
Most previous methods employ a line-oriented approach, but touching
characters and broken lines make the procedure fail at an early stage.
In contrast to previous methods, BDR deals with regions directly, and a
reduced-resolution image is introduced to supplement information
deteriorated by noise. Experimental tests show that BDR reliably
recognizes cells and strings in document images with touching
characters and broken lines.
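
The abstract does not say how the reduced-resolution image is computed.
As an illustration only, the sketch below assumes a block-wise OR over a
binary image, one simple way in which lowering the resolution can close
small gaps in a broken line. The function name reduce_resolution and its
factor parameter are hypothetical and are not taken from the paper.

```python
def reduce_resolution(image, factor):
    """Downsample a binary image (list of rows of 0/1) by block-wise OR.

    A reduced pixel is 1 if any pixel in its factor x factor block is 1.
    This is an assumed scheme for illustration, not the paper's method.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    reduced = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Collect the pixels of one block, clipped at the image edge.
            block = [
                image[y][x]
                for y in range(top, min(top + factor, height))
                for x in range(left, min(left + factor, width))
            ]
            row.append(1 if any(block) else 0)
        reduced.append(row)
    return reduced


# A horizontal line with a one-pixel break at column 5.
broken_line = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
reduced = reduce_resolution(broken_line, 3)
print(reduced)  # [[1, 1, 1, 1]]
```

With a factor of 3, the one-pixel break falls inside a block that still
contains line pixels, so the line appears continuous in the reduced
image. The same effect would also merge genuinely separate strokes that
lie closer together than the block size, so the factor is a trade-off.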