PERFORMANCE ISSUES IN DISTRIBUTED SHARED-NOTHING INFORMATION-RETRIEVAL SYSTEMS

Citation
A. Tomasic and H. Garcia-Molina, PERFORMANCE ISSUES IN DISTRIBUTED SHARED-NOTHING INFORMATION-RETRIEVAL SYSTEMS, Information Processing & Management, 32(6), 1996, pp. 647-665
Number of citations
15
Subject Categories
Information Science & Library Science; Computer Science, Information Systems
Journal ISSN
0306-4573
Volume
32
Issue
6
Year of publication
1996
Pages
647 - 665
Database
ISI
SICI code
0306-4573(1996)32:6<647:PIIDSI>2.0.ZU;2-F
Abstract
Many information-retrieval systems provide access to abstracts. For example, Stanford University, through its FOLIO system, provides access to the INSPEC database of abstracts of the literature on physics, computer science, electrical engineering, etc. In this article, this database is studied by using a trace-driven simulation. The study focuses on a physical-index design that accommodates truncations, inverted-index caching, and database scaling in a distributed shared-nothing system. All three issues are shown to have a strong effect on response time and throughput. Database scaling is explored in two ways. One way assumes an "optimal" configuration for a single host and then linearly scales the database by duplicating the host architecture as needed. The second way determines the optimal number of hosts given a fixed database size. Copyright (C) 1996 Elsevier Science Ltd