RECOGNITION AND DATA EXTRACTION OF FORM DOCUMENTS BASED ON 3 TYPES OF LINE SEGMENTS

Authors
Ly. Tseng; Rc. Chen
Citation
Ly. Tseng and Rc. Chen, RECOGNITION AND DATA EXTRACTION OF FORM DOCUMENTS BASED ON 3 TYPES OF LINE SEGMENTS, Pattern Recognition, 31(10), 1998, pp. 1525-1540
Number of citations
15
Subject Categories
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
Journal title
Pattern Recognition
ISSN journal
0031-3203
Volume
31
Issue
10
Year of publication
1998
Pages
1525 - 1540
Database
ISI
SICI code
0031-3203(1998)31:10<1525:RADEOF>2.0.ZU;2-U
Abstract
Almost all form documents contain line segments. In this paper, we propose an efficient method to recognize any form document that contains at least one line segment. Our method is based on an efficient representation model of the form. The representation model uses three types of line segments to represent a form. All line segments are normalized and sorted after they are extracted. The normalization and sorting not only solve the form scaling problem but also provide a unified and efficient way of matching between forms. To make the recognition method more robust, fuzzy matching is used. With this representation model, when recognizing a skewed form, only the line segments and the data fields need to be rotated, rather than the whole form image. Experimental results show the effectiveness and the efficiency of the method. (C) 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
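Illustrative note: the abstract states that normalizing and sorting the extracted line segments handles form scaling, and that a skewed form can be corrected by rotating only the segments and data fields. The Python sketch below shows one way such steps might look. It is not the authors' algorithm: the segment format `(x1, y1, x2, y2)`, the function names `normalize_segments` and `rotate_segments`, and the bounding-box scaling rule are all assumptions made for illustration, and the three segment types and the fuzzy matching step are not modeled because the abstract does not define them.

```python
import math


def normalize_segments(segments):
    """Scale segments by their common bounding box and sort them.

    Hypothetical representation: each segment is (x1, y1, x2, y2).
    Dividing by the larger bounding-box side keeps the aspect ratio,
    so two scans of one form at different sizes give the same list.
    """
    xs = [x for (x1, _, x2, _) in segments for x in (x1, x2)]
    ys = [y for (_, y1, _, y2) in segments for y in (y1, y2)]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    scaled = [
        ((x1 - min_x) / span, (y1 - min_y) / span,
         (x2 - min_x) / span, (y2 - min_y) / span)
        for (x1, y1, x2, y2) in segments
    ]
    # Sorting gives a canonical order, so forms can be compared item by item.
    return sorted(scaled)


def rotate_segments(segments, angle_deg):
    """Rotate segment endpoints about the origin (counter-clockwise).

    Only the endpoints are transformed, which is far cheaper than
    rotating every pixel of the scanned form image.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        (x1 * c - y1 * s, x1 * s + y1 * c,
         x2 * c - y2 * s, x2 * s + y2 * c)
        for (x1, y1, x2, y2) in segments
    ]


# Usage: the same two-line form scanned at two sizes.
small = [(0, 0, 10, 0), (0, 0, 0, 5)]
large = [(0, 0, 20, 0), (0, 0, 0, 10)]
same_form = normalize_segments(small) == normalize_segments(large)

# Undo an assumed 3-degree skew by rotating the segments back.
skewed = rotate_segments(small, 3.0)
deskewed = rotate_segments(skewed, -3.0)
```

In this sketch the skew angle is taken as given; estimating it from the extracted segments would be a separate step that the abstract does not describe.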