Data mining is an important real-life application for businesses. It is critical to find efficient ways of mining large data sets. In order to benefit from the experience with relational databases, a set-oriented approach to mining data is needed. In such an approach, the data mining operations are expressed in terms of relational or set-oriented operations. Query optimization technology can then be used for efficient processing. In this paper, we describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and thus may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. Algorithm SETM uses only simple database primitives, viz., sorting and merge-scan join. Algorithm SETM is simple, fast, and stable over the range of parameter values. It is easily parallelized and we suggest several additional optimizations. The set-oriented nature of Algorithm SETM makes it possible to develop extensions easily and its performance makes it feasible to build interactive data mining tools for large databases.
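
To make the set-oriented idea concrete, the following is a minimal sketch of how frequent itemsets might be found using only SQL joins, grouping, and filtering. It is written in the spirit of the approach described above and is not the exact formulation of Algorithm SETM. The table name sales, its columns trans_id and item, the intermediate tables r1, cand2, r2, ..., the function name frequent_itemsets, and the use of an absolute transaction count as minimum support are all assumptions made for illustration. The sketch also assumes that each (trans_id, item) pair appears at most once, and it leaves the choice of join method to SQLite rather than forcing a merge-scan join.

```python
import sqlite3


def frequent_itemsets(rows, min_support, max_size=3):
    """Return {itemset tuple: support count} using only SQL set operations.

    rows: iterable of (trans_id, item) pairs, one row per item in a transaction.
    min_support: minimum number of transactions an itemset must appear in
                 (an absolute count; an assumption of this sketch).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (trans_id INTEGER, item TEXT)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    # r1: the rows of sales whose item meets the minimum support.
    con.execute("CREATE TABLE r1 (trans_id INTEGER, item1 TEXT)")
    con.execute(
        "INSERT INTO r1 SELECT trans_id, item FROM sales WHERE item IN "
        "(SELECT item FROM sales GROUP BY item HAVING COUNT(*) >= ?)",
        (min_support,),
    )
    frequent = {
        (item,): n
        for item, n in con.execute("SELECT item1, COUNT(*) FROM r1 GROUP BY item1")
    }

    for k in range(2, max_size + 1):
        prev = [f"item{i}" for i in range(1, k)]
        cols = prev + [f"item{k}"]
        col_list = ", ".join(cols)
        col_defs = ", ".join(f"{c} TEXT" for c in cols)
        prev_sel = ", ".join(f"p.{c}" for c in prev)

        # Candidates: extend each (k-1)-pattern with a lexicographically
        # larger item from the same transaction (a self-join on trans_id).
        con.execute(f"CREATE TABLE cand{k} (trans_id INTEGER, {col_defs})")
        con.execute(
            f"INSERT INTO cand{k} "
            f"SELECT p.trans_id, {prev_sel}, q.item1 "
            f"FROM r{k - 1} p JOIN r1 q "
            f"ON q.trans_id = p.trans_id AND q.item1 > p.item{k - 1}"
        )

        # Counts: group the candidates by pattern and keep supported ones.
        counts = con.execute(
            f"SELECT {col_list}, COUNT(*) FROM cand{k} "
            f"GROUP BY {col_list} HAVING COUNT(*) >= ?",
            (min_support,),
        ).fetchall()
        if not counts:
            break
        for *items, n in counts:
            frequent[tuple(items)] = n

        # r_k: retain only candidate rows whose pattern is supported.
        match = " AND ".join(f"c.{c} = f.{c}" for c in cols)
        con.execute(f"CREATE TABLE r{k} (trans_id INTEGER, {col_defs})")
        con.execute(
            f"INSERT INTO r{k} SELECT c.* FROM cand{k} c JOIN "
            f"(SELECT {col_list} FROM cand{k} GROUP BY {col_list} "
            f"HAVING COUNT(*) >= ?) f ON {match}",
            (min_support,),
        )

    con.close()
    return frequent


# Hypothetical example data: five transactions over items A, B, C.
rows = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "A"), (2, "B"),
    (3, "A"), (3, "C"),
    (4, "B"), (4, "C"),
    (5, "A"), (5, "B"), (5, "C"),
]
result = frequent_itemsets(rows, min_support=3)
```

With this data and a minimum support of 3, each single item is supported by four transactions and each pair by three, while the triple (A, B, C) occurs in only two transactions and is dropped. Because every step is an ordinary join, group-by, or filter, the database's query optimizer decides how to execute it, which is the property the set-oriented approach relies on.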