THE RELATIONSHIP BETWEEN RECALL AND PRECISION

Authors
Citation
M. Buckland and F. Gey, THE RELATIONSHIP BETWEEN RECALL AND PRECISION, Journal of the American Society for Information Science, 45(1), 1994, pp. 12-19
Number of citations
14
Subject Categories
Information Science & Library Science
Journal ISSN
00028231
Volume
45
Issue
1
Year of publication
1994
Pages
12 - 19
Database
ISI
SICI code
0002-8231(1994)45:1<12:TRBRAP>2.0.ZU;2-H
Abstract
Empirical studies of retrieval performance have shown a tendency for Precision to decline as Recall increases. This article examines the nature of the relationship between Precision and Recall. The relationships between Recall and the number of documents retrieved, between Precision and the number of documents retrieved, and between Precision and Recall are described in the context of different assumptions about retrieval performance. It is demonstrated that a trade-off between Recall and Precision is unavoidable whenever retrieval performance is consistently better than retrieval at random. More generally, for the Precision-Recall trade-off to be avoided as the total number of documents retrieved increases, retrieval performance must be equal to or better than overall retrieval performance up to that point. Examination of the mathematical relationship between Precision and Recall shows that a quadratic Recall curve can resemble empirical Recall-Precision behavior if transformed into a tangent parabola. With very large databases and/or systems with limited retrieval capabilities there can be advantages to retrieval in two stages: initial retrieval emphasizing high Recall, followed by more detailed searching of the initially retrieved set, can be used to improve both Recall and Precision simultaneously. Even so, a trade-off between Precision and Recall remains.
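
The relationships named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions (Recall = relevant retrieved / total relevant, Precision = relevant retrieved / total retrieved) and a small made-up ranking; the function name `precision_recall_curve` and the data are hypothetical and are not taken from the article.

```python
from fractions import Fraction


def precision_recall_curve(ranked_relevance, total_relevant):
    """Return a (recall, precision) pair after each retrieved document.

    ranked_relevance: sequence of 0/1 flags in retrieval order
    total_relevant:   number of relevant documents in the whole collection
    """
    points = []
    hits = 0
    for n_retrieved, is_relevant in enumerate(ranked_relevance, start=1):
        hits += int(is_relevant)
        recall = Fraction(hits, total_relevant)
        precision = Fraction(hits, n_retrieved)
        points.append((recall, precision))
    return points


# Hypothetical collection: 10 documents, 4 of them relevant.
# This ranking places relevant documents early, i.e. better than random.
good_ranking = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
curve = precision_recall_curve(good_ranking, total_relevant=4)

for n_retrieved, (recall, precision) in enumerate(curve, start=1):
    print(n_retrieved, recall, precision)
```

In this toy ranking Precision starts at 1 and ends at 4/10, the share of relevant documents in the whole collection, once everything has been retrieved and Recall has reached 1. Precision stays level or rises only at the steps where the newly retrieved document is relevant, which is consistent with the abstract's condition that the trade-off is avoided only while retrieval performs at least as well as it has up to that point.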