B. B. Chaudhuri and U. Garain, An approach for recognition and interpretation of mathematical expressions in printed document, Pattern Analysis and Applications, 3(2), 2000, pp. 120-131
In this paper, we propose an approach for understanding Mathematical
Expressions (MEs) in a printed document. The system is divided into three
main components: (i) detection of MEs in a document; (ii) recognition of the
symbols present in each ME; and (iii) arrangement of the recognised symbols.
MEs printed on separate lines are detected without any character
recognition, whereas embedded expressions (mixed with normal text) are
detected by recognising the mathematical symbols in the text. Some
structural features of the MEs are used in both cases. The mathematical
symbols are grouped into two classes for convenience. First, the frequently
occurring symbols are recognised by a stroke-feature analysis technique.
Recognition of the less frequent symbols involves a hybrid of feature-based
and template-based techniques. The bounding-box coordinates and the size
information of the symbols help to determine the spatial relationships among
the symbols. A set of predefined rules is used to form meaningful symbol
groups so that a logical arrangement of the mathematical expression can be
obtained. Experiments conducted using this approach on a large number of
documents show high accuracy.
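
The abstract does not give the rules used to derive spatial relationships, so the following is only an illustrative sketch of how bounding-box coordinates and symbol size might be combined for that purpose. The `Box` type, the `relation` function, and the `shift` and `shrink` thresholds are all hypothetical names and values chosen for this example; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; y grows downward, as in image coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def center_y(self) -> float:
        return (self.top + self.bottom) / 2.0


def relation(base: Box, other: Box, shift: float = 0.25, shrink: float = 0.8) -> str:
    """Classify `other` (assumed to lie to the right of `base`) relative to `base`.

    `shift` and `shrink` are assumed thresholds, not values from the paper:
    a symbol counts as a script only if it is noticeably smaller than the base
    and its vertical centre is displaced by more than `shift` base heights.
    """
    if base.height <= 0:
        raise ValueError("base box must have positive height")
    # Positive offset means `other` sits higher on the page than `base`.
    offset = (base.center_y - other.center_y) / base.height
    smaller = other.height <= shrink * base.height
    if smaller and offset > shift:
        return "superscript"
    if smaller and offset < -shift:
        return "subscript"
    return "inline"


# Usage: a base symbol followed by a small raised symbol, as in "x^2".
x = Box(left=0, top=10, right=10, bottom=30)
two = Box(left=11, top=5, right=17, bottom=15)
print(relation(x, two))
```

A rule set of this kind would be applied pairwise to neighbouring symbols; handling fractions, radicals, or limits would need further rules that this sketch does not attempt.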