The saying 'a picture is worth a thousand words' exemplifies the great
value of pictures in describing a scenario. Pictures convey spatial
information in a compact form, allowing textual descriptions to
concentrate on the non-spatial (henceforth, contextual) properties of
objects. The difficult task in integrating text and diagrammatic input
to a system is establishing coreference: matching object references in
the text to objects in the diagram. We show that the coreference
problem can be greatly simplified if limited contextual information is
provided directly in diagrams. We present a methodology, the Picture
Semantics description language, for associating contextual information
with objects drawn through graphical editors. We then describe our
implemented research tool, the Figure Understander, which uses this
methodology to integrate the differing information in text and
graphically drawn diagrammatic input into a single unified knowledge
base description. We illustrate the utility of our methods through
examples from two independent domains.