Combining fuzzy information from multiple systems

Authors
Citation
R. Fagin, Combining fuzzy information from multiple systems, J COMPUT SY, 58(1), 1999, pp. 83-99
Citations number
29
Categorie Soggetti
Computer Science & Engineering
Journal title
JOURNAL OF COMPUTER AND SYSTEM SCIENCES
ISSN journal
00220000 → ACNP
Volume
58
Issue
1
Year of publication
1999
Pages
83 - 99
Database
ISI
SICI code
0022-0000(199902)58:1<83:CFIFMS>2.0.ZU;2-G
Abstract
In a traditional database system, the result of a query is a set of values (those values that satisfy the query). In other data servers, such as a sys tem with queries based on image content, or many text retrieval systems, th e result of a query is a sorted list. For example, in the case of a system with queries based on image content, the query might ask for objects that a re a particular shade of red, and the result of the query would be a sorted list of objects in the database, sorted by how well the color of the objec t marches that given in the query. A multimedia system must somehow synthes ize both types of queries (those whose result is a set and those whose resu lt is a sorted list) in a consistent manner. In this paper we discuss the s olution adopted by Garlic, a multimedia information system being developed at the IBM Almaden Research Center. This solution is based on "graded" (or "fuzzy") sets. Issues of efficient query evaluation in a multimedia system are very differ ent from those in a traditional database system. This is because the multim edia system receives answers to subqueries from various subsystems, which c an be accessed only in limited ways. For the important class of queries tha t are conjunctions of atomic queries (where each atomic query might be eval uated by a different subsystem), the naive algorithm must retrieve a number of elements that is linear in the database size. In contrast, in this pape r an algorithm is given, which has been implemented in Garlic, such that if the conjuncts are independent, then with arbitrarily high probability, the total number of elements retrieved in evaluating the query is sublinear in the database size (in the case of two conjuncts, it is of the order of the square root of the database size). It is also shown that for such queries, the algorithm is optimal. The matching upper and lower bounds are robust, in the sense that they hold under almost any reasonable rule ( including th e standard min rule of fuzzy logic) for evaluating the conjunction. Finally , we find a query that is provably hard, in the sense that the naive linear algorithm is essentially optimal. (C) 1999 Academic Press.