On supporting containment queries in relational database management systems

Authors

Zhang, C; Naughton, J; DeWitt, D; Luo, Q; Lohman, G

Citation

C. Zhang et al., On supporting containment queries in relational database management systems, SIGMOD RECORD, 30(2), 2001, pp. 425-436

Number of citations

Subject categories

Computer Science & Engineering

Journal title

SIGMOD RECORD

Journal ISSN

0163-5808

Volume

30

Issue

2

Year of publication

2001

Pages

425 - 436

Database

ISI

SICI code

0163-5808(200106)30:2<425:OSCQIR>2.0.ZU;2-3

Abstract

Virtually all proposals for querying XML include a class of query we term "containment queries". It is also clear that in the foreseeable future, a substantial amount of XML data will be stored in relational database systems. This raises the question of how to support these containment queries. The inverted list technology that underlies much of Information Retrieval is well-suited to these queries, but should we implement this technology (a) in a separate loosely-coupled IR engine, or (b) using the native tables and query execution machinery of the RDBMS? With option (b), more than twenty years of work on RDBMS query optimization, query execution, scalability, and concurrency control and recovery immediately extend to the queries and structures that implement these new operations. But all this will be irrelevant if the performance of option (b) lags that of (a) by too much. In this paper, we explore some performance implications of both options using native implementations in two commercial relational database systems and in a special purpose inverted list engine. Our performance study shows that while RDBMSs are generally poorly suited for such queries, under certain conditions they can outperform an inverted list engine. Our analysis further identifies two significant causes that differentiate the performance of the IR and RDBMS implementations: the join algorithms employed and the hardware cache utilization. Our results suggest that contrary to most expectations, with some modifications, a native implementation in an RDBMS can support this class of query much more efficiently.
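
To make the two options in the abstract concrete, the following is a minimal sketch of one containment query ("find every element named E that contains the word W") evaluated both ways. Everything in it is an assumption made for illustration and is not taken from the paper: the table names `elements` and `words`, their columns, the sample postings, and the function names `ir_containment` and `rdbms_containment`. Element postings are assumed to carry a begin and end position, word postings a single position, and all positions within a document are assumed to be distinct.

```python
import sqlite3
from bisect import bisect_left

# Hypothetical postings (not from the paper).
# Element posting: (term, doc, begin_pos, end_pos); word posting: (term, doc, pos).
ELEMENTS = [
    ("section", 1, 1, 10), ("section", 1, 11, 20), ("section", 2, 1, 8),
    ("title", 1, 2, 4), ("title", 1, 12, 14), ("title", 2, 2, 4),
]
WORDS = [("xml", 1, 3), ("xml", 1, 17), ("xml", 2, 6), ("index", 1, 13)]


def ir_containment(element, word):
    """Option (a) sketch: probe a sorted inverted list for each element interval."""
    word_list = sorted((doc, pos) for term, doc, pos in WORDS if term == word)
    hits = []
    for term, doc, begin, end in ELEMENTS:
        if term != element:
            continue
        # First occurrence of the word at or after the element's begin position.
        i = bisect_left(word_list, (doc, begin))
        if i < len(word_list) and word_list[i][0] == doc and word_list[i][1] < end:
            hits.append((doc, begin, end))
    return sorted(hits)


def rdbms_containment(element, word):
    """Option (b) sketch: the same query as a join over ordinary tables."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE elements (term TEXT, doc INTEGER, begin_pos INTEGER, end_pos INTEGER)"
    )
    con.execute("CREATE TABLE words (term TEXT, doc INTEGER, pos INTEGER)")
    con.executemany("INSERT INTO elements VALUES (?, ?, ?, ?)", ELEMENTS)
    con.executemany("INSERT INTO words VALUES (?, ?, ?)", WORDS)
    rows = con.execute(
        """
        SELECT DISTINCT e.doc, e.begin_pos, e.end_pos
        FROM elements AS e
        JOIN words AS w
          ON w.doc = e.doc AND w.pos > e.begin_pos AND w.pos < e.end_pos
        WHERE e.term = ? AND w.term = ?
        ORDER BY e.doc, e.begin_pos
        """,
        (element, word),
    ).fetchall()
    con.close()
    return rows
```

Both functions return the same set of element intervals on this sample data. In the relational version the containment test is an inequality join predicate, so how efficiently it runs depends on the join algorithm the query engine picks, which is one of the two performance factors the abstract singles out.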