A NEW CRITERION IN SELECTION AND DISCRETIZATION OF ATTRIBUTES FOR THE GENERATION OF DECISION TREES

Citation
Bh. Jun et al., A NEW CRITERION IN SELECTION AND DISCRETIZATION OF ATTRIBUTES FOR THE GENERATION OF DECISION TREES, IEEE transactions on pattern analysis and machine intelligence, 19(12), 1997, pp. 1371-1375
Citations number
12
ISSN journal
01628828
Volume
19
Issue
12
Year of publication
1997
Pages
1371 - 1375
Database
ISI
SICI code
0162-8828(1997)19:12<1371:ANCISA>2.0.ZU;2-R
Abstract
It is important to use a better criterion in selection and discretization of attributes for the generation of decision trees to construct a better classifier in the area of pattern recognition in order to intelligently access huge amounts of data efficiently. Two well-known criteria are gain and gain ratio, both based on the entropy of partitions. We propose in this paper a new criterion based also on entropy, and use both theoretical analysis and computer simulation to demonstrate that it works better than gain or gain ratio in a wide variety of situations. We use the usual entropy calculation where the base of the logarithm is not two but the number of successors to the node. Our theoretical analysis leads to some specific situations in which the new criterion always works better than gain or gain ratio, and the simulation results may implicitly cover all the other situations not covered by the analysis.
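
As a rough illustration of the quantities named in the abstract, the following Python sketch computes gain and gain ratio for a candidate split, plus a variant of gain in which the logarithm base equals the number of successors of the node. The abstract does not state the formula of the proposed criterion, so the function `gain_successor_base` is only an assumption about how the changed base might enter the calculation, not the authors' definition. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels, base=2.0):
    """Shannon entropy of a list of class labels, with a configurable log base."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


def gain(parent, children, base=2.0):
    """Information gain: parent entropy minus the weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch, base) for ch in children)
    return entropy(parent, base) - weighted


def gain_ratio(parent, children):
    """Gain divided by the split information (entropy of the partition sizes)."""
    n = len(parent)
    split_info = -sum(
        (len(ch) / n) * math.log2(len(ch) / n) for ch in children if ch
    )
    return gain(parent, children) / split_info if split_info > 0 else 0.0


def gain_successor_base(parent, children):
    """ASSUMPTION: gain computed with log base = number of successors.

    This only illustrates the changed base mentioned in the abstract;
    it is not claimed to be the criterion proposed in the paper.
    """
    k = len(children)
    if k < 2:
        return 0.0
    return gain(parent, children, base=float(k))


if __name__ == "__main__":
    parent = ["a", "a", "b", "b", "c", "c"]
    three_way = [["a", "a"], ["b", "b"], ["c", "c"]]
    print(gain(parent, three_way))
    print(gain_ratio(parent, three_way))
    print(gain_successor_base(parent, three_way))
```

With base two, a split into more successors can score a larger raw gain simply because the parent entropy can exceed one; using the number of successors as the base rescales the entropy of a uniform k-class node to one, which is one plausible motivation for the change of base.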