R. López de Mántaras et al., COMPARING INFORMATION-THEORETIC ATTRIBUTE SELECTION MEASURES - A STATISTICAL APPROACH, AI Communications, 11(2), 1998, pp. 91-100
In [7], a new information-theoretic attribute selection method for
decision tree induction was introduced. For each node, the method
computes a distance between the partition generated by the values of
each candidate attribute in the node and the correct partition of the
subset of training examples in that node. The chosen attribute is the
one whose partition is closest to the correct partition (i.e., the
partition that perfectly classifies the training data). That paper
also formally proved that this distance is not biased towards
attributes with a large number of values, in the sense
specified by Quinlan in [12], and some initial experimental evidence
suggested that the predictive accuracy of the induced trees was not
significantly different from that obtained with the most widely used
information-theoretic attribute selection measures, that is,
Quinlan's Gain
and Quinlan's Gain Ratio. However, the distance seemed to induce
smaller trees, especially when the attributes had different numbers
of values. That paper could not confirm that the differences were
statistically significant because of the small number of experiments
performed. In this paper we report experimental results that allow us
to confirm that the distance induces trees whose size, without loss
of accuracy, is not significantly different from that of the trees
obtained using Quinlan's Gain but is smaller than that of the trees
obtained with Quinlan's Gain Ratio. These experimental results are
supported by a statistical analysis performed
using two statistical hypothesis tests: the sign test and the signed
rank test.
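
The distance-based selection step summarized above can be sketched as follows. The abstract does not give the formula, so this is a minimal illustration that assumes the normalized partition distance d_N = (H(A|C) + H(C|A)) / H(A, C), where A is the partition induced by an attribute and C is the correct (class) partition. The function names, the dictionary-based row format, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of the partition induced by `labels`."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def normalized_distance(attr_values, classes):
    """Assumed normalized distance between attribute and class partitions.

    Uses H(A|C) + H(C|A) = 2*H(A, C) - H(A) - H(C), so that
    d_N = 2 - (H(A) + H(C)) / H(A, C), which lies in [0, 1].
    """
    joint = entropy(list(zip(attr_values, classes)))
    if joint == 0.0:
        return 0.0  # both partitions consist of a single block
    return 2.0 - (entropy(attr_values) + entropy(classes)) / joint


def select_attribute(rows, classes):
    """Return the attribute whose partition is closest to the class partition.

    `rows` is a list of dicts mapping attribute name -> value (hypothetical format).
    """
    return min(
        rows[0].keys(),
        key=lambda a: normalized_distance([r[a] for r in rows], classes),
    )


# Toy data (invented for illustration): "id" has one distinct value per example.
rows = [
    {"outlook": "sunny", "id": 1},
    {"outlook": "sunny", "id": 2},
    {"outlook": "rain", "id": 3},
    {"outlook": "rain", "id": 4},
]
classes = ["no", "no", "yes", "yes"]

best = select_attribute(rows, classes)
```

On this toy data both attributes classify the examples perfectly, so plain information gain would not separate them, whereas the assumed distance is smaller for the two-valued attribute than for the four-valued one.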