BIAS IN INFORMATION-BASED MEASURES IN DECISION TREE INDUCTION

Authors

White, A. P.; Liu, W. Z.

Citation

A. P. White and W. Z. Liu, BIAS IN INFORMATION-BASED MEASURES IN DECISION TREE INDUCTION, Machine learning, 15(3), 1994, pp. 321-329

Subject categories

Computer Sciences, Computer Science Artificial Intelligence, Neurosciences

Journal title

Machine learning

ISSN journal

0885-6125

Volume

15

Issue

3

Year of publication

1994

Pages

321 - 329

Database

ISI

SICI code

0885-6125(1994)15:3<321:BIIMID>2.0.ZU;2-8

Abstract

A fresh look is taken at the problem of bias in information-based attribute selection measures, used in the induction of decision trees. The approach uses statistical simulation techniques to demonstrate that the usual measures such as information gain, gain ratio, and a new measure recently proposed by Lopez de Mantaras (1991) are all biased in favour of attributes with large numbers of values. It is concluded that approaches which utilise the chi-square distribution are preferable because they compensate automatically for differences between attributes in the number of levels they take.
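
The bias described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes a binary class, attributes drawn uniformly at random and independently of the class, and hypothetical names and parameters (`mean_null_gain`, `n_samples`, `n_trials`). Because such an attribute carries no information about the class, any average information gain above zero is an artefact of the measure.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attr, labels):
    """Class entropy minus the weighted entropy after splitting on attr."""
    n = len(labels)
    groups = {}
    for a, y in zip(attr, labels):
        groups.setdefault(a, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def mean_null_gain(n_values, n_samples=100, n_trials=500, seed=0):
    """Average information gain of an attribute that is independent of the class.

    Assumption for illustration: binary class and uniformly random attribute
    values; the sample size and trial count are arbitrary choices.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        labels = [rng.randrange(2) for _ in range(n_samples)]
        attr = [rng.randrange(n_values) for _ in range(n_samples)]
        total += information_gain(attr, labels)
    return total / n_trials


if __name__ == "__main__":
    # Compare uninformative attributes that differ only in their number of values.
    for k in (2, 5, 10, 20):
        print(k, round(mean_null_gain(k), 4))
```

In runs of this sketch the average gain should grow with the number of attribute values, even though none of the attributes is informative about the class.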