The searching of protein databases as a method of identifying newly
sequenced genes is commonplace in molecular biology laboratories.
However, it is a procedure that is not usually formally taught to
students, and method cookbooks discuss it only briefly. This article
uses a single family of highly diverged uracil-DNA glycosylases, which
fall into two distinct groups, to highlight some of the difficulties
associated with identification of such proteins by database searching.