A probabilistic approach to navigation in Hypertext

Citation
M. Levene et G. Loizou, A probabilistic approach to navigation in Hypertext, INF SCI, 114(1-4), 1999, pp. 165-186
Citations number
28
Categorie Soggetti
Information Tecnology & Communication Systems
Journal title
INFORMATION SCIENCES
ISSN journal
00200255 → ACNP
Volume
114
Issue
1-4
Year of publication
1999
Pages
165 - 186
Database
ISI
SICI code
0020-0255(199903)114:1-4<165:APATNI>2.0.ZU;2-O
Abstract
One of the main unsolved problems confronting Hypertext is the navigation p roblem, namely the problem of having to know where you are in the database graph representing the structure of a Hypertext database, and knowing how t o get to some other place you are searching for in the database graph. Prev iously we formalised a Hypertext database in terms of a directed graph whos e nodes represent pages of information. The notion of a trail, which is a p ath in the database graph describing some logical association amongst the p ages in the trail, is central to our model. We defined a Hypertext Query La nguage, HQL, over Hypertext databases and showed that in general the naviga tion problem, i.e. the problem of finding a trail that satisfies a HQL quer y (technically known as the model checking problem), is NP-complete. Herein we present a preliminary investigation of using a probabilistic approach i n order to enhance the efficiency of model checking. The flavour of our inv estigation is that if we have some additional statistical information about the Hypertext database then we can utilise such information during query p rocessing. We present two different approaches. The first approach utilises the theory of probabilistic automata. In this approach we view a Hypertext database as a probabilistic automaton, which we call a Hypertext probabili stic automaton. In such an automaton we assume that the probability of trav ersing a link is determined by the usage statistics of that link. We exhibi t a special case when the number of trails that satisfy a query is always f inite and indicate how to give a finite approximation of answering a query in the general case. The second approach utilises the theory of random Turi ng machines. In this approach we view a Hypertext database as a probabilist ic algorithm, realised via a Hypertext random automaton. In such an automat on we assume that out of a choice of links, traversing any one of them is e qually likely. We obtain the lower bound of the probability that a random t rail satisfies a query. In principle, by iterating this probabilistic algor ithm, associated with the Hypertext database, the probability of finding a trail that satisfies the query can be made arbitrarily large. (C) 1999 Else vier Science Inc. All rights reserved.