PERFORMANCE ANALYSIS OF 3 TEXT-JOIN ALGORITHMS

Authors

MENG WY; YU C; WANG W; RISHE N

Citation

Wy. Meng et al., PERFORMANCE ANALYSIS OF 3 TEXT-JOIN ALGORITHMS, IEEE transactions on knowledge and data engineering, 10(3), 1998, pp. 477-492

Subject categories

Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic

Journal title

IEEE transactions on knowledge and data engineering

ISSN journal

10414347

Volume

10

Issue

3

Year of publication

1998

Pages

477 - 492

Database

ISI

SICI code

1041-4347(1998)10:3<477:PAO3TA>2.0.ZU;2-#

Abstract

When a multidatabase system contains textual database systems (i.e., information retrieval systems), queries against the global schema of the multidatabase system may contain a new type of joins: joins between attributes of textual type. Three algorithms for processing such a type of joins are presented and their I/O costs are analyzed in this paper. Since such a type of joins often involves document collections of very large size, it is very important to find efficient algorithms to process them. The three algorithms differ on whether the documents themselves or the inverted files on the documents are used to process the join. Our analysis and the simulation results indicate that the relative performance of these algorithms depends on the input document collections, system characteristics, and the input query. For each algorithm, the type of input document collections with which the algorithm is likely to perform well is identified. An integrated algorithm that automatically selects the best algorithm to use is also proposed.