SALSA: The stochastic approach for link-structure analysis

Citation
R. Lempel et S. Moran, SALSA: The stochastic approach for link-structure analysis, ACM T INF S, 19(2), 2001, pp. 131-160
Citations number
25
Categorie Soggetti
Information Tecnology & Communication Systems
Journal title
ACM TRANSACTIONS ON INFORMATION SYSTEMS
ISSN journal
10468188 → ACNP
Volume
19
Issue
2
Year of publication
2001
Pages
131 - 160
Database
ISI
SICI code
1046-8188(200104)19:2<131:STSAFL>2.0.ZU;2-P
Abstract
Today, when searching for information on the WWW, one usually performs a qu ery through a term-based search engine. These engines return, as the query' s result, a list of Web pages whose contents matches the query. For broad-t opic queries, such searches often result in a huge set of retrieved documen ts, many of which are irrelevant to the user. However, much information is contained in the link-structure of the WWW. Information such as which pages are linked to others can be used to augment search algorithms. In this con text, Jon Kleinberg introduced the notion of two distinct types of Web page s: hubs and authorities. Kleinberg argued that hubs and authorities exhibit a mutually reinforcing relationship: a good hub will point to many authori ties, and a good authority will be pointed at by many hubs. In light of thi s, he devised an algorithm aimed at finding authoritative pages. We present SALSA, a new stochastic approach for link-structure analysis, which examin es random walks on graphs derived from the link-structure. We show that bot h SALSA and Kleinberg's Mutual Reinforcement approach employ the same metaa lgorithm. We then prove that SALSA is equivalent to a weighted in-degree an alysis of the link-structure of WWW subgraphs, making it computationally mo re efficient than the Mutual Reinforcement approach. We compare the results of applying SALSA to the results derived through Kleinberg's approach. The se comparisons reveal a topological phenomenon called the TKC Effect which, in certain cases, prevents the Mutual Reinforcement approach from identify ing meaningful authorities.