Document image template matching based on component block list

Citation
Hc. Peng et al., Document image template matching based on component block list, PATT REC L, 22(9), 2001, pp. 1033-1042
Number of citations
11
Subject categories
AI Robotics and Automatic Control
Journal title
PATTERN RECOGNITION LETTERS
ISSN journal
0167-8655
Volume
22
Issue
9
Year of publication
2001
Pages
1033 - 1042
Database
ISI
SICI code
0167-8655(200107)22:9<1033:DITMBO>2.0.ZU;2-F
Abstract
Document image matching is the key technique for document image registration and retrieval. In this paper, a new matching method based on document component block list (CBL) is proposed. A document image is firstly parsed into a number of component blocks that are defined as non-adherent rectangular areas of substantial document contents. Then these blocks are organized as a list, on which several matching operations are defined. The template image that is most similar to the querying document image is selected as the matching result. Our method can effectively make use of the local information of each page component block and the global information of document page layout. We investigate the method with large-scale document template image database. Our method manifests good matching accuracy and good robustness to image distortion, filled-in text, and noises. (C) 2001 Published by Elsevier Science B.V.
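
The abstract does not specify the matching operations defined on the block list, so the following is only an illustrative sketch of the general idea: represent each page as a list of rectangular blocks, score a query page against each template, and return the most similar template. The `Block` class, the intersection-over-union score, the symmetric averaging, and the function names are all assumptions made for this example, not the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Block:
    """Hypothetical component block: an axis-aligned rectangle in
    page-normalized coordinates (0..1). Not the paper's data structure."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        # Degenerate or inverted rectangles count as empty.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two blocks (assumed local similarity)."""
    inter = Block(max(a.x0, b.x0), max(a.y0, b.y0),
                  min(a.x1, b.x1), min(a.y1, b.y1)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def list_similarity(query: List[Block], template: List[Block]) -> float:
    """Assumed global score: each block is matched to its best-overlapping
    counterpart, averaged in both directions so that missing and extra
    blocks both lower the score."""
    if not query or not template:
        return 0.0
    q_to_t = sum(max(iou(q, t) for t in template) for q in query) / len(query)
    t_to_q = sum(max(iou(t, q) for q in query) for t in template) / len(template)
    return 0.5 * (q_to_t + t_to_q)


def best_template(query: List[Block], templates: Dict[str, List[Block]]) -> str:
    """Return the name of the template whose block list scores highest."""
    return max(templates, key=lambda name: list_similarity(query, templates[name]))


if __name__ == "__main__":
    # Made-up layouts, purely for illustration.
    templates = {
        "invoice": [Block(0.1, 0.1, 0.9, 0.2), Block(0.1, 0.3, 0.5, 0.8)],
        "letter": [Block(0.1, 0.1, 0.4, 0.15), Block(0.1, 0.2, 0.9, 0.9)],
    }
    query = [Block(0.11, 0.1, 0.9, 0.21), Block(0.1, 0.31, 0.5, 0.8)]
    print(best_template(query, templates))
```

Averaging the best-match overlap in both directions is one simple way to combine per-block (local) agreement with whole-page (layout) agreement; a score that also tolerates translation or scaling of the page would need an alignment step that this sketch omits.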