The multiple instance problem arises in tasks where the training
examples are ambiguous: a single example object may have many
alternative feature vectors (instances) that describe it, and yet only
one of those feature vectors may be responsible for the observed
classification of the object. This paper describes and compares three
kinds of algorithms that learn axis-parallel rectangles to solve the
multiple instance problem. Algorithms that ignore the multiple instance
problem perform very poorly. An algorithm that directly confronts the
multiple instance problem (by attempting to identify which feature
vectors are responsible for the observed classifications) performs
best, giving 89% correct predictions on a musk odor prediction task.
The paper also illustrates the use of artificial data to debug and
compare these algorithms.
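
To make the setting concrete, the following minimal sketch shows how an axis-parallel rectangle could be used to classify an object described by several instances. It assumes the usual reading of the problem, namely that an object is labelled positive when at least one of its feature vectors falls inside the rectangle. The names `instance_in_rectangle` and `classify_bag`, the term "bag" for the set of instances of one object, and the toy data are all illustrative assumptions. This is not one of the learning algorithms compared in the paper, only the prediction rule such a rectangle would induce.

```python
from typing import List, Sequence, Tuple

# Illustrative types (assumptions, not taken from the paper):
# an instance is one feature vector; a bag holds all instances of one object;
# a rectangle is given by per-feature lower and upper bounds.
Instance = Sequence[float]
Bag = List[Instance]
Rectangle = Tuple[Sequence[float], Sequence[float]]


def instance_in_rectangle(x: Instance, rect: Rectangle) -> bool:
    """Return True if every feature of x lies within the rectangle's bounds."""
    lower, upper = rect
    return all(lo <= v <= hi for lo, v, hi in zip(lower, x, upper))


def classify_bag(bag: Bag, rect: Rectangle) -> bool:
    """Label the object positive if at least one of its instances is inside.

    Assumed rule: a single instance inside the rectangle is enough to
    account for a positive classification of the whole object.
    """
    return any(instance_in_rectangle(x, rect) for x in bag)


# Toy two-feature data, invented for illustration only.
rect: Rectangle = ([0.0, 0.0], [1.0, 1.0])
positive_bag: Bag = [[2.0, 3.0], [0.5, 0.5], [-1.0, 4.0]]  # second instance is inside
negative_bag: Bag = [[2.0, 3.0], [1.5, 0.2]]               # no instance is inside

print(classify_bag(positive_bag, rect))  # True
print(classify_bag(negative_bag, rect))  # False
```

The ambiguity described above is visible in `positive_bag`: the object's label does not say which of its three instances is the responsible one, so a learner must infer that while fitting the rectangle bounds.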