D. Niyogi and S. N. Srihari, Integrated Approach to Document Decomposition and Structural Analysis, International Journal of Imaging Systems and Technology, 7(4), 1996, pp. 330-342
A document image is a visual representation of a paper document, such
as a journal article page, a facsimile cover page, office
correspondence, or an application form. Document image understanding
as a research endeavor consists of developing processes for taking a
document through various representations, from scanned image to
semantic representation. This article describes document decomposition
and structural analysis, which constitute one of the major processes
involved in document image understanding. The current state of the art
and future directions in the areas of document segmentation, layout
analysis, and logical block grouping are indicated. A system that
performs decomposition and structural analysis (including logical
grouping and read-order determination) on complex multiarticled
documents is presented. This system uses bottom-up segmentation
techniques to identify
the block structure of a document, and layout rules to classify and
group these blocks into logical units that represent meaningful
subdivisions of the document. Experimental results showing the
efficiency of
this approach are presented and discussed. (C) 1996 John Wiley & Sons,
Inc.
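
The two stages named in the abstract (bottom-up segmentation into blocks,
then layout rules for classification and read order) can be illustrated
with a minimal sketch. This is not the authors' system: the names Box,
merge_bottom_up, classify and analyze, the gap thresholds, and the single
"wide block is a headline" rule are all hypothetical, chosen only to show
the general shape of such a pipeline on text-line bounding boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int

    def union(self, other):
        return Box(min(self.x0, other.x0), min(self.y0, other.y0),
                   max(self.x1, other.x1), max(self.y1, other.y1))


def gap(a, b):
    """Horizontal and vertical gap between two boxes (0 where they overlap)."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0)
    return dx, dy


def merge_bottom_up(boxes, max_dx=10, max_dy=10):
    """Bottom-up step: keep merging any two boxes closer than the thresholds.

    The thresholds are assumed values, not taken from the article.
    """
    blocks = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                dx, dy = gap(blocks[i], blocks[j])
                if dx <= max_dx and dy <= max_dy:
                    blocks[i] = blocks[i].union(blocks[j])
                    del blocks[j]
                    merged = True
                    break
            if merged:
                break
    return blocks


def classify(block, page_width):
    """Assumed layout rule: a block spanning most of the page is a headline."""
    width = block.x1 - block.x0
    return "headline" if width >= 0.8 * page_width else "body"


def analyze(line_boxes, page_width):
    """Segment, classify, and order blocks: headlines first, then columns
    left to right, each column top to bottom (an assumed read-order rule)."""
    blocks = merge_bottom_up(line_boxes)
    labeled = [(classify(b, page_width), b) for b in blocks]
    labeled.sort(key=lambda lb: (lb[0] != "headline", lb[1].x0, lb[1].y0))
    return labeled
```

A real system would start from connected components rather than ready-made
line boxes and would apply many more layout rules; the sketch only shows
how proximity-based merging feeds a rule-based grouping stage.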