This paper examines the statistical distribution of relevance
judgments by reviewing and comparing the results of several studies
over the last three decades. A characteristic distribution is found,
which appears to be quite consistent across experimental settings and
intents, although some variations are found. This distribution is
highly positively skewed, with a flat tail and a small upturn at the
high end. Possible reasons for this shape are discussed, and a
hypothesis is offered to explain this phenomenon based on human
judging characteristics and the nature of relevance experimentation.