Linear hash functions

Citation
N. Alon et al., Linear hash functions, J ACM, 46(5), 1999, pp. 667-683
Citations number
17
Categorie Soggetti
Computer Science & Engineering
Journal title
Volume
46
Issue
5
Year of publication
1999
Pages
667 - 683
Database
ISI
SICI code
Abstract
Consider the set H of all linear (or affine) transformations between two ve ctor spaces over a finite field F. We study how good SE is as a class of ha sh functions, namely we consider hashing a set S of size n into a range hav ing the same cardinality n by a randomly chosen function from H and look at the expected size of the largest hash bucket. H is a universal class of ha sh functions for any finite field. but with respect to our measure differen t fields behave differently. If the finite field F has n elements, then there is a bad set S subset of F -2 of size n with expected maximal bucket size Ohm(n1/3). If n is a perfect square, then there is even a bad set with largest bucket size always at le ast root n. (This is worst possible, since with respect to a universal clas s of hash functions every set of size n has expected largest bucket size be low root n + 1/2.) If, however, we consider the field of two elements, then we yet much better bounds. The best previously known upper bound on the expected size of the largest bucket for this class was O(2(root log n)). We reduce this upper bo und to O(log n log log n). Note that this is not far from the guarantee for a random function. There. the average largest bucket would be theta(log ni log log n). In the course of our proof we develop a tool which may be of independent in terest. Suppose we have a subset S of a vector space D over Z(2), and consi der a random linear mapping of D to a smaller vector space R. If the cardin ality of S is larger than c(epsilon)\R\log\R\, then with probability 1 - ep silon, the image of S will cover all elements in the range.