One of the most important problems in information retrieval is determining
the order of documents in the answer returned to the user. Many methods and
algorithms fur document ordering have been proposed. The method introduced
in this paper differs from them especially in that it uses a probabilistic
model of a document set. In this model documents are regarded as states of
a Markov chain, where transition probabilities are directly proportional t
o similarities between documents. Steady-state probabilities reflect simila
rities of particular documents to the whole answer set. If documents are or
dered according to these probabilities, at the top of a list there will be
documents that are the best representatives of the set, and at the bottom t
hose which are the worst representatives. The method was tested against dat
abases INSPEC and Networked Computer Science Technical Reference Library (N
CSTRL). Test results are positive. Values of the Kendall rank correlation c
oefficient indicate high similarity between rankings generated by the propo
sed method and rankings produced by experts. Results are comparable with ra
nkings generated by the vector model using standard weighting schema tf . i
df. (C) 2001 Elsevier Science Ltd. All rights reserved.