We have developed an efficient way to determine the syntactic
similarity of files and have applied it to every document on the
World Wide Web. Using this mechanism, we built a clustering of all
the documents that are syntactically similar. Possible applications
include a "Lost and Found" service, filtering the results of Web
searches, updating widely distributed web-pages, and identifying
violations of intellectual property rights.
(C) 1997 Published by Elsevier Science B.V.