M. R. Naphade et al., Audio-visual query and retrieval: A system that uses dynamic programming and relevance feedback, J ELECTR IM, 10(4), 2001, pp. 861-870
A necessary capability for content-based retrieval is support for the
query-by-example paradigm. In the past, there have been several attempts to
use low-level features for video retrieval. However, most of these approaches
support queries using image sequences only. We present an algorithm for
matching multimodal (audio-visual) patterns for the purpose of content-based
video retrieval. The novel ability of our approach to use the information
content in multiple media, coupled with a strong emphasis on temporal
similarity, differentiates it from the state of the art in content-based
retrieval. At the core of the pattern matching scheme is a dynamic
programming algorithm, which leads to a significant improvement in
performance. By coupling the use of audio with video, this algorithm can also
be applied to grouping shots based on audio-visual similarity. We also
support relevance feedback: the user can provide feedback to the system by
choosing clips that are closer to the user's desired target. The system then
automatically adjusts the relative weights, or relevance, of the media and
fetches different sets of target clips accordingly. We observe that a few
iterations of such feedback are generally sufficient for retrieving the
desired video clips. (C) 2001 SPIE and IS&T.
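
The abstract does not give the matching algorithm or the weight-update rule in detail, so the following is only an illustrative sketch of the two ideas it names. It assumes that each clip is a list of (audio feature vector, visual feature vector) pairs, uses a generic dynamic-time-warping recurrence as a stand-in for the paper's dynamic programming step, and uses a simple heuristic for relevance feedback. The names frame_distance, dp_align_cost, and update_audio_weight, the Euclidean distance, and the step parameter are all hypothetical choices, not taken from the paper.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dp_align_cost(query, target, w_audio=0.5):
    """Length-normalized alignment cost between two clips.

    Each clip is a list of (audio_vec, visual_vec) frames. This is a
    generic dynamic-time-warping recurrence, assumed here for
    illustration; w_audio in [0, 1] is the relative weight of audio.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        q_audio, q_visual = query[i - 1]
        for j in range(1, m + 1):
            t_audio, t_visual = target[j - 1]
            d = (w_audio * frame_distance(q_audio, t_audio)
                 + (1.0 - w_audio) * frame_distance(q_visual, t_visual))
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def update_audio_weight(query, liked_clips, w_audio, step=0.2):
    """Move w_audio toward the modality in which the liked clips match best.

    Hypothetical heuristic: if the clips the user chose are closer to the
    query in audio than in video, audio is treated as more relevant.
    """
    audio_cost = sum(dp_align_cost(query, c, 1.0) for c in liked_clips)
    visual_cost = sum(dp_align_cost(query, c, 0.0) for c in liked_clips)
    total = audio_cost + visual_cost
    if total == 0.0:
        return w_audio  # no evidence either way
    target_weight = visual_cost / total  # low audio cost -> high audio weight
    return (1.0 - step) * w_audio + step * target_weight


if __name__ == "__main__":
    query = [([0.0], [0.0]), ([1.0], [1.0])]
    liked = [[([0.0], [5.0]), ([1.0], [6.0])]]  # same audio, different video
    print(update_audio_weight(query, liked, w_audio=0.5))
```

In this sketch, candidate clips would be ranked by dp_align_cost against the query, and the weight returned by update_audio_weight would be used for the next round of ranking after the user marks the clips they prefer.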