SYNTACTIC CLUSTERING OF THE WEB

Citation
A.Z. Broder et al., SYNTACTIC CLUSTERING OF THE WEB, Computer Networks and ISDN Systems, 29(8-13), 1997, pp. 1157-1166
Number of citations
7
Subject categories
"Computer Sciences","System Science","Telecommunications","Engineering, Electrical & Electronic","Computer Science Information Systems"
ISSN journal
0169-7552
Volume
29
Issue
8-13
Year of publication
1997
Pages
1157 - 1166
Database
ISI
SICI code
0169-7552(1997)29:8-13<1157:SCOTW>2.0.ZU;2-J
Abstract
We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web. Using this mechanism, we built a clustering of all the documents that are syntactically similar. Possible applications include a "Lost and Found" service, filtering the results of Web searches, updating widely distributed web-pages, and identifying violations of intellectual property rights. (C) 1997 Published by Elsevier Science B.V.
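The abstract does not say how syntactic similarity is computed. As an illustration only, the sketch below assumes one common approach: treat each document as a set of contiguous word sequences ("shingles") and score two documents by the overlap of those sets (Jaccard coefficient). The function names `shingles` and `resemblance` and the window size `w=4` are assumptions made for this example, not details taken from this record.

```python
def shingles(text, w=4):
    """Return the set of contiguous w-word sequences in text (hypothetical helper)."""
    words = text.lower().split()
    if len(words) < w:
        # Short documents: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a, b, w=4):
    """Jaccard overlap of the two shingle sets, in the range [0, 1]."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        # Two empty documents are treated as identical (an assumption).
        return 1.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "possible applications include filtering the results of web searches"
    near_copy = "possible applications include filtering the results of searches"
    print(resemblance(original, near_copy))
```

Documents whose score exceeds a chosen threshold could then be grouped into the same cluster. Comparing every pair of documents this way would not scale to the whole Web, so a real system would need a compact summary per document rather than full shingle sets.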