Finding related pages in the World Wide Web

Authors

Dean, J Henzinger, MR

Citation

J. Dean and M.R. Henzinger, Finding related pages in the World Wide Web, COMPUT NET, 31(11-16), 1999, pp. 1467-1479

Citations number

Subject categories

Information Technology & Communication Systems

Journal title

COMPUTER NETWORKS-THE INTERNATIONAL JOURNAL OF COMPUTER AND TELECOMMUNICATIONS NETWORKING

ISSN journal

1389-1286

Volume

31

Issue

11-16

Year of publication

1999

Pages

1467 - 1479

Database

ISI

SICI code

1389-1286(19990517)31:11-16<1467:FRPITW>2.0.ZU;2-E

Abstract

When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to Web searching where the input to the search process is not a set of query terms, but instead is the URL of a page, and the output is a set of related Web pages. A related Web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related Web pages. These algorithms use only the connectivity information in the Web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape's 'What's Related' service (http://home.netscape.com/escapes/related/). Our study showed that the precision at 10 for our two algorithms are 73% better and 51% better than that of Netscape, despite the fact that Netscape uses both content and usage pattern information in addition to connectivity information. (C) 1999 Published by Elsevier Science B.V. All rights reserved.
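The abstract does not spell out how the two algorithms work, so the following is only an illustrative sketch of one connectivity-only idea, co-citation counting: pages that are frequently linked from the same parent pages as the input URL are treated as related. It is not presented as the authors' method. The function names, the toy link graph, and the `.example` hosts are all hypothetical. A small helper also shows how a "precision at 10" style metric can be computed from a ranked list and a set of pages judged relevant.

```python
from collections import Counter


def related_by_cocitation(links, url, top_n=10):
    """Rank pages by how many parent pages they share with `url`.

    `links` maps each page to the set of pages it links to
    (a hypothetical in-memory link graph, assumed for illustration).
    """
    # Parents of `url` are the pages that link to it.
    parents = [page for page, children in links.items() if url in children]

    # Count how often every other page appears alongside `url`
    # in the outgoing links of those parents.
    counts = Counter()
    for parent in parents:
        for sibling in links[parent]:
            if sibling != url:
                counts[sibling] += 1

    # Highest co-citation count first; ties broken by name for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:top_n]]


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k slots filled by pages judged relevant.

    Missing results (fewer than k returned) count as non-relevant.
    """
    hits = sum(1 for page in ranked[:k] if page in relevant)
    return hits / k


# Toy link graph (assumed data, not taken from the paper).
toy_links = {
    "news-directory.example": {
        "www.nytimes.com",
        "www.washingtonpost.com",
        "paper-c.example",
    },
    "media-list.example": {"www.nytimes.com", "www.washingtonpost.com"},
    "misc.example": {"www.nytimes.com", "recipes.example"},
}

results = related_by_cocitation(toy_links, "www.nytimes.com")
print(results)
print(precision_at_k(results, {"www.washingtonpost.com", "paper-c.example"}, k=3))
```

In this toy graph, www.washingtonpost.com shares two parents with www.nytimes.com and therefore ranks first, which mirrors the online-newspaper example in the abstract. A real system would need a link index covering far more pages than fit in a dictionary.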