Image understanding denotes not only the ability to extract specific,
non-numerical information from images, but also the ability to reason about
the extracted information. We propose a qualitative representation of image
understanding results that is suitable for reasoning with Bayesian networks.
Our qualitative representation is enhanced with probabilistic information
to represent uncertainties and errors in the understanding of noisy sensory
data. The probabilistic information is supplied to a Bayesian network in
order to find the most plausible interpretation. We apply this approach to
the integration of image and speech understanding in a scenario where we
want to find objects in a visually observed scene that are verbally
described by a human. Results demonstrate the performance of our approach.
(C) 2000 Elsevier Science B.V. All rights reserved.