Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.
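To make the "no search involved" claim concrete, the following is a minimal sketch, not the paper's implementation, of the structure step behind a TAN-style classifier: estimate the conditional mutual information I(X_i; X_j | C) between every pair of discrete features from counts, then keep a maximum-weight spanning tree over the features. The data layout (features as parallel columns), the function names cond_mutual_info and tan_tree, and the Kruskal-style tree builder are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
import math

def cond_mutual_info(xi, xj, c):
    """Empirical I(Xi; Xj | C) from three parallel lists of discrete values."""
    n = len(c)
    n_xic = Counter(zip(xi, c))        # counts for (Xi, C)
    n_xjc = Counter(zip(xj, c))        # counts for (Xj, C)
    n_c = Counter(c)                   # counts for C
    n_all = Counter(zip(xi, xj, c))    # counts for (Xi, Xj, C)
    mi = 0.0
    for (a, b, k), n_abk in n_all.items():
        # p(a,b,k) * log[ p(a,b,k) p(k) / (p(a,k) p(b,k)) ], all n's cancel
        mi += (n_abk / n) * math.log(
            (n_abk * n_c[k]) / (n_xic[(a, k)] * n_xjc[(b, k)]))
    return mi

def tan_tree(columns, labels):
    """Edges of a maximum-weight spanning tree over the features.

    Deterministic: weigh every feature pair, then greedily add the
    heaviest edges that do not create a cycle (union-find check).
    """
    m = len(columns)
    weighted = sorted(
        ((cond_mutual_info(columns[i], columns[j], labels), i, j)
         for i, j in combinations(range(m), 2)),
        reverse=True)
    parent = list(range(m))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:                   # edge (i, j) keeps the graph acyclic
            parent[ri] = rj
            edges.append((i, j))
    return edges

# Toy usage: three discrete feature columns and a binary class variable.
features = [[0, 0, 1, 1, 0, 1],
            [0, 1, 1, 0, 0, 1],
            [1, 1, 0, 0, 1, 0]]
labels = [0, 0, 1, 1, 0, 1]
print(tan_tree(features, labels))
```

A full TAN classifier would then root the tree at one feature, direct the edges away from the root, add the class node as an extra parent of every feature, and estimate the conditional probability tables from counts; since every step is a closed-form computation over the data, no search through network structures is required.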