The selection of unbiased representatives from a large database is
complicated by the requirement that the chosen entries be not only
genuinely different from each other but also typical of the family of
related entries. A method satisfying this 2-fold objective was
developed by equipping complete linkage clustering with a novel noise
elimination procedure to deal with overlapping cluster structure. A
total of 200 nuclear families of truly related Brookhaven Protein Data
Bank structures were generated, from which any entry can be chosen to
represent its family.
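
As a rough illustration of the clustering step only, the following Python sketch shows plain complete linkage clustering on a precomputed distance matrix, followed by picking one entry per family. It is a minimal sketch under stated assumptions, not the method itself: the noise elimination procedure is not specified in this text and is not implemented, and the function name `complete_linkage`, the `threshold` parameter, and the toy distances are all hypothetical.

```python
from itertools import combinations


def complete_linkage(dist, threshold):
    """Agglomerative complete-linkage clustering on a symmetric distance matrix.

    Merging stops once the two closest clusters are farther apart than
    `threshold` (a hypothetical cutoff chosen for illustration).
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: cluster distance is the LARGEST pairwise distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Made-up distances among five hypothetical database entries.
toy_dist = [
    [0.00, 0.10, 0.20, 0.90, 0.80],
    [0.10, 0.00, 0.15, 0.85, 0.90],
    [0.20, 0.15, 0.00, 0.95, 0.90],
    [0.90, 0.85, 0.95, 0.00, 0.10],
    [0.80, 0.90, 0.90, 0.10, 0.00],
]

families = complete_linkage(toy_dist, threshold=0.3)
# With every member within the cutoff of every other member of its family,
# any entry may stand in for the family; here the first is taken.
representatives = [family[0] for family in families]
```

Complete linkage bounds the distance between any two members of a cluster by the cutoff, which is what makes an arbitrary member a reasonable representative in this toy setting.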