The content and access dynamics of a busy Web site: Findings and implications

Citation
V.N. Padmanabhan and L. Qiu, The content and access dynamics of a busy Web site: Findings and implications, COMP COM R, 30(4), 2000, pp. 111-123
Number of citations
26
Subject categories
Information Technology & Communication Systems
Journal title
SIGCOMM computer communication review
Journal ISSN
0146-4833
Volume
30
Issue
4
Year of publication
2000
Pages
111 - 123
Database
ISI
SICI code
0146-4833(200010)30:4<111:TCAADO>2.0.ZU;2-J
Abstract
In this paper, we study the dynamics of the MSNBC news site, one of the busiest Web sites in the Internet today. Unlike many other efforts that have analyzed client accesses as seen by proxies, we focus on the server end. We analyze the dynamics of both the server content and client accesses made to the server. The former considers the content creation and modification process while the latter considers page popularity and locality in client accesses. Some of our key results are: (a) files tend to change little when they are modified, (b) a small set of files tends to get modified repeatedly, (c) file popularity follows a Zipf-like distribution with a parameter alpha that is much larger than reported in previous, proxy-based studies, and (d) there is significant temporal stability in file popularity but not much stability in the domains from which clients access the popular content. We discuss the implications of these findings for techniques such as Web caching (including cache consistency algorithms), and prefetching or server-based "push" of Web content.
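Result (c) refers to a Zipf-like distribution, in which the number of requests for the i-th most popular file is roughly proportional to 1/i^alpha. The following is a minimal sketch of one common way to estimate alpha from per-file request counts, by fitting a least-squares line to log(count) against log(rank). It is not taken from the paper; the function name `estimate_zipf_alpha` and the synthetic counts are assumptions made for illustration, and the authors' own fitting procedure may differ.

```python
import math


def estimate_zipf_alpha(counts):
    """Estimate alpha for a Zipf-like popularity curve (hypothetical helper).

    Fits a least-squares line to log(count) versus log(rank) and returns
    the negated slope, so that count is roughly C / rank**alpha.
    """
    # Rank files from most to least requested; zero counts have no logarithm.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two files with nonzero counts")

    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)

    # The fitted slope is -alpha for a Zipf-like distribution.
    return -sxy / sxx


# Synthetic request counts generated with alpha = 1.5 (illustrative only,
# not data from the MSNBC traces).
synthetic_counts = [1000.0 / rank ** 1.5 for rank in range(1, 201)]
alpha = estimate_zipf_alpha(synthetic_counts)
```

A larger alpha means that requests are concentrated on fewer files, which is why the parameter matters when reasoning about how much a cache of a given size can absorb.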