A. Tomasic and H. Garcia-Molina, PERFORMANCE ISSUES IN DISTRIBUTED SHARED-NOTHING INFORMATION-RETRIEVAL SYSTEMS, Information Processing & Management, 32(6), 1996, pp. 647-665
Number of citations
15
Subject Categories
Information Science & Library Science; Computer Science, Information Systems
Many information-retrieval systems provide access to abstracts. For example,
Stanford University, through its FOLIO system, provides access to the INSPEC
database of abstracts of the literature on physics, computer science,
electrical engineering, etc. This article studies that database using a
trace-driven simulation. The study focuses on three issues: a physical-index
design that accommodates truncations, inverted-index caching, and database
scaling in a distributed shared-nothing system. All three issues are shown to
have a strong effect on response time and throughput. Database scaling is
explored in two ways. The first assumes an "optimal" configuration for a
single host and then linearly scales the database by duplicating the host
architecture as needed. The second determines the optimal number of hosts
given a fixed database size.
Copyright (C) 1996 Elsevier Science Ltd