This paper presents general-purpose video analysis and annotation tools,
which combine high-level and low-level information, and which learn
through user interaction and feedback. The use of these tools is
illustrated through
the construction of two video browsers, which allow a user to fast forward
(or rewind) to frames, shots, or scenes containing a particular character,
characters, or other labeled content. The two browsers developed in this
work are: (1) a basic video browser, which exploits relations between
high-level scripting information and closed captions, and (2) an advanced
video browser, which augments the basic browser with annotations gained
from applying machine learning. The learner helps the system adapt to
different people's labelings by accepting positive and negative examples
of labeled content from a user, and relating these to low-level color and
texture features extracted from the digitized video. This learning
happens interactively, and
is used to infer labels on data the user has not yet seen. The labeled
data may then be browsed or retrieved from the database in real time. An
evaluation of the learning performance shows that a combination of
low-level color signal features outperforms several other combinations of
signal features in learning character labels in an episode of the TV
situation comedy Seinfeld. We discuss several issues that arise in the
combination of low-level and high-level information, and illustrate
solutions to these issues within the context of browsing television
sitcoms.