USING COOLLISTS TO INDEX HTML DOCUMENTS IN THE WEB

Authors
Citation
Jg. Lim, USING COOLLISTS TO INDEX HTML DOCUMENTS IN THE WEB, Computer networks and ISDN systems, 28(1-2), 1995, pp. 147-154
Citations number
8
Subject Categories
Computer Sciences; System Science; Telecommunications; Engineering, Electrical & Electronic; Computer Science Information Systems
ISSN journal
01697552
Volume
28
Issue
1-2
Year of publication
1995
Pages
147 - 154
Database
ISI
SICI code
0169-7552(1995)28:1-2<147:UCTIHD>2.0.ZU;2-S
Abstract
This paper suggests a partial solution (limited to HTML documents) to the Web-indexing problem using Coollists. Roughly, a Coollist is equivalent to a Hotlist in Mosaic except that it automatically records all the visited HTML document titles by default. Thus, in theory, maintaining a merged list of everybody's Coollists should eventually create a complete index of all the HTML files in the Web. In practice, even if transferring everybody's Coollists to a single site were feasible, the growth and change rate of the Web make us question whether the archie metaphor of "every index server maintains all the know-wheres" could be applied to the rest of the Web. The new metaphor we are suggesting is a library metaphor. Let each organization maintain the merged Coollists of its individuals. If some organization has surplus computing resources, let it maintain the merged list of other merged lists. This way, individuals are likely to find documents of interest within their own organization. But organizations have characteristics, just as libraries have specialities. Therefore, individuals will find other interesting documents at "neighboring" sites. Bigger libraries carry more books. Likewise, there will be sites that merge many merged lists together, which will be useful for blind keyword searching of the titles. For our current implementation of a Coollist, we take advantage of the CERN proxy-cache server to collect the indices of all the visited HTML files. People at three of the 19 plants within the company tried the merged list of Coollists and found it almost indispensable. People who used to save almost every URL they visited and those who wanted a comprehensive list of URLs found it particularly useful. In the paper, we describe the results of our experiment in detail and also point out how our approach might solve the scalability problem of other Web indexing solutions.
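
The merging scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the (URL, title) entry layout, the function names merge_coollists and search_titles, and the example.org addresses are all assumptions made for illustration. The abstract does not say how duplicate URLs are resolved, so the sketch simply keeps the last title seen.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed layout: one Coollist entry is a (url, title) pair.
Entry = Tuple[str, str]


def merge_coollists(coollists: Iterable[Iterable[Entry]]) -> Dict[str, str]:
    """Merge several Coollists into one URL -> title index.

    If a URL occurs more than once, the last title seen is kept
    (an assumption; the abstract does not specify this).
    """
    merged: Dict[str, str] = {}
    for coollist in coollists:
        for url, title in coollist:
            merged[url] = title
    return merged


def search_titles(index: Dict[str, str], keyword: str) -> List[Entry]:
    """Case-insensitive substring search over titles, sorted by URL."""
    needle = keyword.lower()
    return sorted(
        (url, title) for url, title in index.items() if needle in title.lower()
    )


# Hypothetical individual Coollists within one organization.
alice = [
    ("http://a.example.org/x.html", "Proxy Cache Notes"),
    ("http://a.example.org/y.html", "Library Hours"),
]
bob = [
    ("http://a.example.org/x.html", "Proxy Cache Notes"),
    ("http://b.example.org/z.html", "Caching on the Web"),
]

# Organization-level merged list.
org_index = merge_coollists([alice, bob])

# A "bigger library": a merged list of other merged lists.
other_org_index = {"http://c.example.org/w.html": "Web Indexing Survey"}
big_index = merge_coollists([org_index.items(), other_org_index.items()])
```

Because a merged list has the same shape as a single Coollist, the same function serves both the organization level and the "merged list of merged lists" level, and search_titles(big_index, "web") performs the blind keyword search over titles that the abstract mentions.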