In this paper we propose algorithms for generation of frequent item sets by
successive construction of the nodes of a lexicographic tree of item sets.
We discuss different strategies in generation and traversal of the lexicog
raphic tree such as breadth-first search, depth-first search, or a combinat
ion of the two. These techniques provide different trade-offs in terms of t
he I/O, memory, and computational time requirements. We use the hierarchica
l structure of the lexicographic tree to successively project transactions
at each node of the lexicographic tree and use matrix counting on this redu
ced set of transactions for finding frequent item sets. We tested our algor
ithm on both real and synthetic data. We provide an implementation of the t
ree projection method which is up to one order of magnitude faster than oth
er recent techniques in the literature. The algorithm has a well-structured
data access pattern which provides data locality and reuse of data for mul
tiple levels of the cache. We also discuss methods for parallelization of t
he TreeProjection algorithm. (C) 2001 Academic Press.