The Monte Carlo method and the evaluation of retrieval system performance

Authors
R. Burgin
Citation
R. Burgin, The Monte Carlo method and the evaluation of retrieval system performance, J AM S INFO, 50(2), 1999, pp. 181-191
Number of citations
38
Subject categories
Library & Information Science
Journal title
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
ISSN journal
0002-8231
Volume
50
Issue
2
Year of publication
1999
Pages
181 - 191
Database
ISI
SICI code
0002-8231(199902)50:2<181:TMCMAT>2.0.ZU;2-A
Abstract
The ability to distinguish between acceptable and unacceptable levels of retrieval performance and the ability to distinguish between significant and non-significant differences between retrieval results are important to traditional information retrieval experiments. The Monte Carlo method is shown to represent an attractive alternative to the hypergeometric model for identifying the levels at which random retrieval performance is exceeded in retrieval test collections and for overcoming some of the limitations of the hypergeometric model. The Monte Carlo method produces low performance thresholds for the individual test collections that are very similar to the thresholds derived by the hypergeometric model, both at the test collection level and at the individual query level. In addition, the Monte Carlo method is much less computer-intensive than the hypergeometric model, can be used with measures of retrieval effectiveness that take the rank order of the retrieved documents into consideration, can be used to derive the probability of obtained results, and can be used to determine the statistical significance of difference between two or more retrieval results. The ability to use the Monte Carlo method to derive the probability of obtained results and to compare two or more retrieval results makes it possible to determine more accurately how well retrieval systems operate under specific conditions and, in conjunction with the presentation of individual query results, makes it possible to determine whether relationships between query characteristics and retrieval system performance exist. Understanding these relationships should lead to improvements in the effectiveness of retrieval systems.
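
As a rough illustration of the comparison the abstract describes, the Python sketch below estimates a random-retrieval threshold for a single query in two ways: by Monte Carlo simulation and by the exact hypergeometric distribution. It is not the procedure used in the paper. The collection size, number of relevant documents, retrieval cutoff, the 95% level, and the function names monte_carlo_threshold and hypergeometric_threshold are all assumptions chosen for the example.

```python
import math
import random


def monte_carlo_threshold(n_docs, n_relevant, n_retrieved,
                          trials=10000, level=0.95, seed=0):
    """Simulate random retrieval and return the smallest hit count k such
    that at least `level` of the simulated runs retrieve <= k relevant docs."""
    rng = random.Random(seed)
    # Assumption for the sketch: document ids 0..n_relevant-1 are the relevant ones.
    hits = []
    for _ in range(trials):
        retrieved = rng.sample(range(n_docs), n_retrieved)
        hits.append(sum(1 for doc in retrieved if doc < n_relevant))
    hits.sort()
    return hits[math.ceil(level * trials) - 1]


def hypergeometric_threshold(n_docs, n_relevant, n_retrieved, level=0.95):
    """Return the smallest k whose exact hypergeometric CDF reaches `level`."""
    total = math.comb(n_docs, n_retrieved)
    cumulative = 0
    for k in range(min(n_relevant, n_retrieved) + 1):
        # Ways to pick k relevant and (n_retrieved - k) non-relevant documents.
        cumulative += (math.comb(n_relevant, k)
                       * math.comb(n_docs - n_relevant, n_retrieved - k))
        if cumulative / total >= level:
            return k
    return min(n_relevant, n_retrieved)


if __name__ == "__main__":
    # Hypothetical query: 1000 documents, 50 relevant, 20 retrieved.
    print(monte_carlo_threshold(1000, 50, 20))
    print(hypergeometric_threshold(1000, 50, 20))
```

A retrieval run that returns more relevant documents than the threshold would be unlikely under random retrieval at the chosen level. The simulation counts relevant documents only, but the counting line could be replaced by any scoring function, including one that depends on the rank order of the retrieved documents, which is the kind of measure the abstract says the hypergeometric model does not accommodate.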