INTEGRATED APPROACH TO DOCUMENT DECOMPOSITION AND STRUCTURAL-ANALYSIS

Citation
D. Niyogi and S.N. Srihari, INTEGRATED APPROACH TO DOCUMENT DECOMPOSITION AND STRUCTURAL-ANALYSIS, International Journal of Imaging Systems and Technology, 7(4), 1996, pp. 330-342
Number of citations
18
Subject categories
Optics; Engineering, Electrical & Electronic
ISSN journal
0899-9457
Volume
7
Issue
4
Year of publication
1996
Pages
330 - 342
Database
ISI
SICI code
0899-9457(1996)7:4<330:IATDDA>2.0.ZU;2-8
Abstract
A document image is a visual representation of a paper document, such as a journal article page, a cover page of facsimile transmission, office correspondence, an application form, etc. Document image understanding as a research endeavor consists of developing processes for taking a document through various representations, from scanned image to semantic representation. This article describes document decomposition and structural analysis, which constitutes one of the major processes involved in document image understanding. The current state of the art and future directions in the areas of document segmentation, layout analysis, and logical block grouping are indicated. A system that performs decomposition and structural analysis (including logical grouping and read-order determination) on complex multiarticled documents is presented. This system uses bottom-up segmentation techniques to identify the block structure of a document, and layout rules to classify and group these blocks into logical units that represent meaningful subdivisions of the document. Experimental results showing the efficiency of this approach are presented and discussed. © 1996 John Wiley & Sons, Inc.