L. T. Su, IS RELEVANCE AN ADEQUATE CRITERION FOR RETRIEVAL-SYSTEM EVALUATION - AN EMPIRICAL INQUIRY INTO THE USER EVALUATION, Proceedings of the ASIS annual meeting, 30, 1993, pp. 93-103
Number of citations
27
Subject Categories
Information Science & Library Science
Relevance is a key notion in information science and has been a major
criterion for evaluating IR system effectiveness since the 1960s.
Evaluation research through the 1970s concentrated on evaluating
subsystems or subprocesses of the systems using the best-known
relevance-based measures: recall and precision. Different definitions or
interpretations of relevance have been proposed in the history of IR
evaluation. Other criteria have also been suggested and claimed to be
competitive or complementary for evaluation. However, many of the
arguments or criticisms concerning the various criteria were based on
logical grounds or mere speculation. Little empirical research has
been conducted on the relative importance of various criteria for
evaluating information retrieval performance. The recent resurgence of
interest in relevance research, as evidenced by many contributed papers
and the forming of the SIG/FIS-Relevance Group, has prompted the
rethinking of an old question: Is relevance an adequate criterion for
retrieval system evaluation? This paper attempts to address this
question by providing a
brief review of various thoughts related to this question found in IR
literature and by presenting some empirical evidence collected by Su
in an earlier study [1] concerning user's assessment of retrieval
system performance. A total of 26 success dimensions were identified
through content analysis of 203 users' reasons for system success. This
suggests that the user's judgment of system performance is a
multi-dimensional assessment. Although relevance appears to be an
important criterion, there are many other considerations affecting
users' assessments
of system success. Dimensions or categories of success related to
relevance, as well as those unrelated to it, are discussed. Results from
content analysis of verbal data are compared with those from factor
analysis of quantitative data reported by Su [2]. Implications and
future research are also discussed.