IS RELEVANCE AN ADEQUATE CRITERION FOR RETRIEVAL-SYSTEM EVALUATION - AN EMPIRICAL INQUIRY INTO THE USER EVALUATION

Authors
Citation
Lt. Su, IS RELEVANCE AN ADEQUATE CRITERION FOR RETRIEVAL-SYSTEM EVALUATION - AN EMPIRICAL INQUIRY INTO THE USER EVALUATION, Proceedings of the ASIS annual meeting, 30, 1993, pp. 93-103
Number of citations
27
Subject Categories
Information Science & Library Science
ISSN journal
00447870
Volume
30
Year of publication
1993
Pages
93 - 103
Database
ISI
SICI code
0044-7870(1993)30:<93:IRAACF>2.0.ZU;2-C
Abstract
Relevance is a key notion in information science and has been a major criterion for evaluating IR system effectiveness since the 1960s. Evaluation research through the 1970s concentrated on evaluating subsystems or subprocesses of the systems using the best-known relevance-based measures: recall and precision. Different definitions or interpretations of relevance have been proposed in the history of IR evaluation. Other criteria have also been suggested and claimed to be competitive or complementary for evaluation. However, many of the arguments or criticisms concerning the various criteria were based on logical grounds or simply speculation. Little empirical research has been conducted on the relative importance of various criteria for evaluating information retrieval performance. The recent resurgence of interest in relevance research, as evidenced by many contributed papers and the forming of the SIG/FIS-Relevance Group, prompted the rethinking of an old question: Is relevance an adequate criterion for retrieval system evaluation? This paper attempts to address this question by providing a brief review of various thoughts related to this question found in the IR literature and by presenting some empirical evidence collected by Su in an earlier study [1] concerning users' assessment of retrieval system performance. A total of 26 success dimensions were identified through content analysis of 203 users' reasons for system success. This suggests that the user's judgment of system performance is a multi-dimensional assessment. Although relevance appears to be an important criterion, there are many other considerations affecting users' assessments of system success. Dimensions or categories of success related to relevance, as well as those not related to it, are discussed. Results from content analysis of verbal data are compared with those from factor analysis of quantitative data reported by Su [2]. Implications and future research are also discussed.