
The effect of pool depth on system evaluation in TREC

Authors

Keenan, S; Smeaton, AF; Keogh, G

Citation

S. Keenan et al., The effect of pool depth on system evaluation in TREC, J AM SOC IN, 52(7), 2001, pp. 570-574

Citations number

Subject Categories

Library & Information Science

Journal title

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY

ISSN journal

1532-2882

Volume

52

Issue

7

Year of publication

2001

Pages

570 - 574

Database

ISI

SICI code

1532-2882(200105)52:7<570:TEOPDO>2.0.ZU;2-8

Abstract

The TREC benchmarking exercise for information retrieval (IR) experiments has provided a forum and an opportunity for IR researchers to evaluate the performance of their approaches to the IR task and has resulted in improvements in IR effectiveness. Typically, retrieval performance has been measured in terms of precision and recall, and comparisons between different IR approaches have been based on these measures. These measures are in turn dependent on the so-called "pool depth" used to discover relevant documents. Whereas there is evidence to suggest that the pool depth size used for TREC evaluations adequately identifies the relevant documents in the entire test data collection, we consider how it affects the evaluations of individual systems. The data used comes from the Sixth TREC conference, TREC-6. By fitting appropriate regression models we explore whether different pool depths confer advantages or disadvantages on different retrieval systems when they are compared. As a consequence of this model fitting, a pair of measures for each retrieval run, which are related to precision and recall, emerge. For each system, these give an extrapolation for the number of relevant documents the system would have been deemed to have retrieved if an indefinitely large pool size had been used, and also a measure of the sensitivity of each system to pool size. We concur that even on the basis of analyses of individual systems, the pool depth of 100 used by TREC is adequate.
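
Illustrative note (not part of the original record): the abstract does not state the form of the regression models, so none is reproduced here. The minimal Python sketch below only illustrates the general idea of depth-k pooling that the abstract relies on: the pool is the union of the top-k documents from every run, only pooled documents are judged, and so the number of relevant documents a run is credited with can change with k. All names and the toy data (`build_pool`, `relevant_retrieved`, `counts_by_depth`, `run_a`, `run_b`, document ids) are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` documents of every run."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def relevant_retrieved(ranking: List[str], pool: Set[str], relevant: Set[str]) -> int:
    """Count documents in a ranking that were judged (in the pool) and are relevant."""
    return sum(1 for doc in ranking if doc in pool and doc in relevant)


def counts_by_depth(
    runs: Dict[str, List[str]], relevant: Set[str], depths: Iterable[int]
) -> Dict[int, Dict[str, int]]:
    """For each pool depth, report the relevant-retrieved count credited to each run."""
    table: Dict[int, Dict[str, int]] = {}
    for depth in depths:
        pool = build_pool(runs, depth)
        table[depth] = {
            name: relevant_retrieved(ranking, pool, relevant)
            for name, ranking in runs.items()
        }
    return table


# Toy data (assumed for illustration only): two ranked runs for one topic.
runs = {
    "run_a": ["d1", "d2", "d3", "d4", "d5"],
    "run_b": ["d6", "d1", "d7", "d3", "d8"],
}
# Documents that an assessor would judge relevant if they were pooled.
truly_relevant = {"d1", "d3", "d5", "d8"}

table = counts_by_depth(runs, truly_relevant, [1, 3, 5])
for depth, counts in table.items():
    print(depth, counts)
```

In this toy case each run's credited count grows as the pool deepens, which is the kind of depth-dependent curve that a model could then be fitted to; how the paper's authors actually modelled it is not recoverable from this record.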