The literature on the evaluation of Internet search engines is reviewed.
Although there have been many studies, there has been little consistency
in the way such studies have been carried out. This problem is exacerbated
by the fact that recall is virtually impossible to calculate in the
fast-changing Internet environment, and therefore the traditional Cranfield
type of evaluation is not usually possible. A variety of alternative
evaluation methods has been suggested to overcome this difficulty. The
authors recommend that a standardised set of tools be developed for the
evaluation of web search engines so that, in future, comparisons can be
made between search engines more effectively, and variations in the
performance of any given search engine over time can be tracked. The paper
itself does not provide such a standard set of tools, but it investigates
the issues and makes preliminary recommendations on the types of tools
needed.