R. Ghoshroy et al., ONLINE LEGAL-AID - MARKOV-CHAIN MODEL FOR EFFICIENT RETRIEVAL OF LEGAL DOCUMENTS, Image and vision computing, 16(12-13), 1998, pp. 941-946
It is widely accepted that, with large databases, the key to good
performance is effective data clustering. In any large document
database, clustering is essential for efficient searching, browsing
and, therefore, retrieval. Cluster analysis allows the identification
of groups, or clusters, of similar objects in multi-dimensional space
[1]. Conventional document retrieval systems match a query against
individual documents, whereas a clustered search compares a query with
clusters of documents, thereby achieving efficient retrieval. In most
document databases, clusters must be updated periodically because the
database is dynamic. Experimental evidence, however, shows that
clustered searches are substantially less effective than conventional
searches of the corresponding non-clustered documents. In this paper,
we investigate the present clustering criteria and their drawbacks. We
propose a new approach to clustering and explain why this new approach
should be tested and, if proved beneficial, adopted. (C) 1998 Elsevier
Science B.V. All rights reserved.
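
The contrast the abstract draws between conventional and clustered search can be illustrated with a minimal sketch. It is not the method proposed in the paper: the term-frequency vectors, the cosine similarity measure, the function names and the sample documents are all assumptions chosen for illustration. A conventional search scores every document against the query; a clustered search scores one centroid per cluster first and then looks only inside the best-matching cluster.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector of a text (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(docs):
    """Summed term vector of a cluster; cosine similarity ignores scale."""
    total = Counter()
    for doc in docs:
        total.update(vectorize(doc))
    return total


def conventional_search(query, docs):
    """Compare the query with every individual document."""
    q = vectorize(query)
    return max(docs, key=lambda d: cosine(q, vectorize(d)))


def clustered_search(query, clusters):
    """Compare the query with cluster centroids, then search one cluster."""
    q = vectorize(query)
    best_cluster = max(clusters, key=lambda c: cosine(q, centroid(c)))
    return max(best_cluster, key=lambda d: cosine(q, vectorize(d)))


# Hypothetical sample data, grouped by hand into two clusters.
clusters = [
    ["tenant eviction notice rules", "landlord tenant lease dispute"],
    ["divorce custody child support", "marriage divorce settlement"],
]
all_docs = [doc for cluster in clusters for doc in cluster]

conventional_hit = conventional_search("tenant lease", all_docs)
clustered_hit = clustered_search("tenant lease", clusters)
```

In this toy case both searches return the same document, but the clustered search never inspects documents outside the chosen cluster. A relevant document that was assigned to a different cluster would be missed, which is one plausible reading of why clustered searches can be less effective than conventional ones.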