To find the optimal branching of a nominal attribute at a node in an L-ary
decision tree, one is often forced to search over all possible L-ary partit
ions for the one that yields the minimum impurity measure. For binary trees
(L = 2) when there are just two classes a short-cut search is possible tha
t is linear in n, the number of distinct values of the attribute. For the g
eneral case in which the number of classes, k, may be greater than two, Bur
shtein et al. have shown that the optimal partition satisfies a condition t
hat involves the existence of (L\atop 2) hyperplanes in the class probabili
ty space. We derive a property of the optimal partition for concave impurit
y measures (including in particular the Gini and entropy impurity measures)
in terms of the existence of L vectors in the dual of the class probabilit
y space, which implies the earlier condition.
Unfortunately, these insights still do not offer a practical search method
when n and k are large, even for binary trees. We therefore present a new h
euristic search algorithm to find a good partition. It is based on ordering
the attribute's values according to their principal component scores in th
e class probability space, and is linear in n. We demonstrate the effective
ness of the new method through Monte Carlo simulation experiments and compa
re its performance against other heuristic methods.