THE RELEVANCE OF RECALL AND PRECISION IN USER EVALUATION

Authors

SU LT

Citation

L.T. Su, THE RELEVANCE OF RECALL AND PRECISION IN USER EVALUATION, Journal of the American Society for Information Science, 45(3), 1994, pp. 207-217

Citations number

Subject categories

Information Science & Library Science

Journal title

Journal of the American Society for Information Science

ISSN journal

0002-8231

Volume

45

Issue

3

Year of publication

1994

Pages

207 - 217

Database

ISI

SICI code

0002-8231(1994)45:3<207:TRORAP>2.0.ZU;2-F

Abstract

The appropriateness of evaluation criteria and measures has been a subject of debate and a vital concern in the information retrieval evaluation literature. A study was conducted to investigate the appropriateness of 20 measures for evaluating interactive information retrieval performance, representing four major evaluation criteria. Among the 20 measures studied were the two most well-known relevance-based measures of effectiveness, recall and precision. The user's judgment of information retrieval success was used as the devised criterion measure with which all other 20 measures were to be correlated. A sample of 40 end-users with individual information problems from an academic environment was observed, interacting with six professional intermediaries searching on their behalf in large operational systems. Quantitative data consisting of values for all measures studied and verbal data containing users' reasons for assigning certain values to selected measures were collected. Statistical analysis of the quantitative data showed that precision, one of the most important traditional measures of effectiveness, is not significantly correlated with the user's judgment of success. Users appear to be more concerned with absolute recall than with precision, although absolute recall was not directly tested in the study. Four related measures of recall and precision are found to be significantly correlated with success. Among these are user's satisfaction with completeness of search results and user's satisfaction with precision of the search. This article explores the possible explanations for this outcome through content analysis of users' verbal data. The analysis shows that high precision does not always mean high quality (relevancy, completeness, etc.) to users because of different users' expectations. The user's purpose in obtaining information is suggested to be the primary cause for the high concern for recall. Implications for research and practice are discussed.