Multiple similarity queries: A basic DBMS operation for mining in metric databases

Citation
B. Braunmuller et al., Multiple similarity queries: A basic DBMS operation for mining in metric databases, IEEE KNOWL, 13(1), 2001, pp. 79-95
Number of citations
29
Subject categories
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Journal ISSN
1041-4347
Volume
13
Issue
1
Year of publication
2001
Pages
79 - 95
Database
ISI
SICI code
1041-4347(200101/02)13:1<79:MSQABD>2.0.ZU;2-G
Abstract
Metric databases are databases where a metric distance function is defined for pairs of database objects. In such databases, similarity queries in the form of range queries or k-nearest-neighbor queries are the most important query types. In traditional query processing, single queries are issued independently by different users. In many data mining applications, however, the database is typically explored by iteratively asking similarity queries for answers of previous similarity queries. In this paper, we introduce a generic scheme for such data mining algorithms and we investigate two orthogonal approaches, reducing I/O cost as well as CPU cost, to speed up the processing of multiple similarity queries. The proposed techniques apply to any type of similarity query and to an implementation based on an index or using a sequential scan. Parallelization yields an additional impressive speed-up. An extensive performance evaluation confirms the efficiency of our approach.
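The two cost reductions named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, only one plausible reading under stated assumptions: the batch of queries is answered in a single sequential scan, so each page is read once for all queries (I/O saving), and the triangle inequality on precomputed query-to-query distances is used to skip some distance computations (CPU saving). The function name `multiple_range_queries`, the page layout, and the Euclidean metric are all hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def multiple_range_queries(pages, queries, eps):
    """Answer several range queries (radius eps) in one pass over the pages.

    Returns (results, distance_calls), where results[i] lists the objects
    within eps of queries[i] and distance_calls counts the object-to-query
    distance computations actually performed.
    """
    # Query-to-query distances, computed once up front (assumed pruning aid).
    qq = [[dist(a, b) for b in queries] for a in queries]
    results = [[] for _ in queries]
    distance_calls = 0

    for page in pages:  # each page is visited once for the whole batch
        for obj in page:
            known = {}  # query index -> distance to obj already computed
            for i, q in enumerate(queries):
                # Triangle inequality: d(q_i, obj) >= |d(q_j, obj) - d(q_i, q_j)|.
                # If that lower bound exceeds eps, obj cannot be an answer.
                if any(abs(d_j - qq[i][j]) > eps for j, d_j in known.items()):
                    continue
                d = dist(q, obj)
                distance_calls += 1
                known[i] = d
                if d <= eps:
                    results[i].append(obj)
    return results, distance_calls


# Toy data: two "pages" of 2-D points and two query points.
pages = [[(0, 0), (1, 0)], [(5, 5), (10, 10)]]
queries = [(0, 0), (10, 10)]
results, distance_calls = multiple_range_queries(pages, queries, eps=1.5)
```

On this toy input the batch needs fewer distance computations than the eight a query-by-query scan would perform, while returning the same answers. How much is saved in practice depends on how close the batched queries are to each other, which is an assumption of this sketch rather than a result taken from the record.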