In this paper, we study the problem of constructing and maintaining a large
shared repository of Web pages. We discuss the unique characteristics of
such a repository, propose an architecture, and identify its functional
modules. We focus on the storage manager module and illustrate how
traditional techniques for storage and indexing can be tailored to meet the
requirements of a Web repository. To evaluate design alternatives, we also
present experimental results from a prototype repository called WebBase,
which is currently being developed at Stanford University. (C) 2000
Published by Elsevier Science B.V. All rights reserved.