In a traditional database system, the result of a query is a set of values
(those values that satisfy the query). In other data servers, such as a sys
tem with queries based on image content, or many text retrieval systems, th
e result of a query is a sorted list. For example, in the case of a system
with queries based on image content, the query might ask for objects that a
re a particular shade of red, and the result of the query would be a sorted
list of objects in the database, sorted by how well the color of the objec
t marches that given in the query. A multimedia system must somehow synthes
ize both types of queries (those whose result is a set and those whose resu
lt is a sorted list) in a consistent manner. In this paper we discuss the s
olution adopted by Garlic, a multimedia information system being developed
at the IBM Almaden Research Center. This solution is based on "graded" (or
"fuzzy") sets.
Issues of efficient query evaluation in a multimedia system are very differ
ent from those in a traditional database system. This is because the multim
edia system receives answers to subqueries from various subsystems, which c
an be accessed only in limited ways. For the important class of queries tha
t are conjunctions of atomic queries (where each atomic query might be eval
uated by a different subsystem), the naive algorithm must retrieve a number
of elements that is linear in the database size. In contrast, in this pape
r an algorithm is given, which has been implemented in Garlic, such that if
the conjuncts are independent, then with arbitrarily high probability, the
total number of elements retrieved in evaluating the query is sublinear in
the database size (in the case of two conjuncts, it is of the order of the
square root of the database size). It is also shown that for such queries,
the algorithm is optimal. The matching upper and lower bounds are robust,
in the sense that they hold under almost any reasonable rule ( including th
e standard min rule of fuzzy logic) for evaluating the conjunction. Finally
, we find a query that is provably hard, in the sense that the naive linear
algorithm is essentially optimal. (C) 1999 Academic Press.