This paper demonstrates a new application of computer vision to digital libraries: the use of texture for annotation, the description of content. Vision-based annotation assists the user in attaching descriptions to large sets of images and video. If a user labels a piece of an image as "water," a texture model can be used to propagate this label to other "visually similar" regions. However, a serious problem is that no single model has been found that reliably matches human perception of similarity in pictures. Rather than using one model, the system described here knows several texture models and is equipped with the ability to choose the one that "best explains" the regions selected by the user for annotation. If none of these models suffices, it creates new explanations by combining models. Examples of annotations propagated by the system on natural scenes are given. The system provides an average gain of four to one in label prediction for a set of 98 images.
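The select-then-propagate loop described above can be sketched in code. This is a minimal illustration, not the paper's actual system: the three toy feature extractors, the tightest-cluster selection criterion, and the distance threshold are all assumptions standing in for the paper's richer texture models and "best explains" machinery.

```python
import numpy as np

# Hypothetical texture "models": each maps an image patch to a feature vector.
# These toy extractors stand in for the multiple texture models the paper describes.
def mean_intensity(patch):
    return np.array([patch.mean()])

def intensity_variance(patch):
    return np.array([patch.var()])

def gradient_energy(patch):
    gy, gx = np.gradient(patch.astype(float))
    return np.array([(gx ** 2 + gy ** 2).mean()])

MODELS = {"mean": mean_intensity, "var": intensity_variance, "grad": gradient_energy}

def best_model(labeled_patches):
    """Pick the model under which the user-labeled patches cluster most
    tightly (smallest relative spread), i.e. the one that best explains them."""
    best_name, best_spread = None, np.inf
    for name, feature in MODELS.items():
        feats = np.array([feature(p) for p in labeled_patches])
        spread = feats.std() / (abs(feats.mean()) + 1e-9)  # scale-invariant spread
        if spread < best_spread:
            best_name, best_spread = name, spread
    return best_name

def propagate(labeled_patches, candidates, threshold=0.2):
    """Return indices of candidate patches whose feature, under the chosen
    model, lies within `threshold` relative distance of the labeled mean."""
    feature = MODELS[best_model(labeled_patches)]
    center = np.mean([feature(p) for p in labeled_patches], axis=0)
    scale = abs(center).sum() + 1e-9
    return [i for i, c in enumerate(candidates)
            if np.linalg.norm(feature(c) - center) / scale < threshold]

# Illustrative usage: two smooth "water-like" example patches, then two
# candidates, of which only the first resembles the labeled examples.
rng = np.random.default_rng(0)
labeled = [rng.normal(10, 0.5, (8, 8)), rng.normal(10, 0.5, (8, 8))]
candidates = [rng.normal(10, 0.5, (8, 8)), rng.normal(100, 20, (8, 8))]
matched = propagate(labeled, candidates)
```

Here the label attached to the example patches would spread only to the visually similar first candidate; combining models when no single one suffices (as the paper does) is omitted for brevity.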