CONTEXT-SENSITIVE FEATURE-SELECTION FOR LAZY LEARNERS

Authors
Domingos, P.
Citation
P. Domingos, CONTEXT-SENSITIVE FEATURE-SELECTION FOR LAZY LEARNERS, Artificial Intelligence Review, 11(1-5), 1997, pp. 227-253
Number of citations
38
Subject Categories
Computer Sciences, Special Topics; Computer Science Artificial Intelligence
Journal ISSN
02692821
Volume
11
Issue
1-5
Year of publication
1997
Pages
227 - 253
Database
ISI
SICI code
0269-2821(1997)11:1-5<227:CFFLL>2.0.ZU;2-L
Abstract
High sensitivity to irrelevant features is arguably the main shortcoming of simple lazy learners. In response to it, many feature selection methods have been proposed, including forward sequential selection (FSS) and backward sequential selection (BSS). Although they often produce substantial improvements in accuracy, these methods select the same set of relevant features everywhere in the instance space, and thus represent only a partial solution to the problem. In general, some features will be relevant only in some parts of the space; deleting them may hurt accuracy in those parts, but selecting them will have the same effect in parts where they are irrelevant. This article introduces RC, a new feature selection algorithm that uses a clustering-like approach to select sets of locally relevant features (i.e., the features it selects may vary from one instance to another). Experiments in a large number of domains from the UCI repository show that RC almost always improves accuracy with respect to FSS and BSS, often with high significance. A study using artificial domains confirms the hypothesis that this difference in performance is due to RC's context sensitivity, and also suggests conditions where this sensitivity will and will not be an advantage. Another feature of RC is that it is faster than FSS and BSS, often by an order of magnitude or more.
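
Illustrative sketch (not part of the original record): to make the baseline concrete, the Python below shows one common reading of forward sequential selection (FSS) wrapped around a 1-nearest-neighbor lazy learner. It is not the RC algorithm, whose details the abstract does not give. The function names, the leave-one-out scoring, the stopping rule (stop when no added feature improves accuracy), and the toy data are all assumptions made for illustration.

```python
import random


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN restricted to the given feature indices."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue  # never use the query instance as its own neighbor
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def forward_sequential_selection(X, y):
    """Greedily add the feature that most improves accuracy; stop when none does.

    Assumed stopping rule: a feature is added only on strict improvement.
    The selected set is global, i.e. the same for every instance.
    """
    n_features = len(X[0])
    selected, best_acc = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in candidates]
        acc, f = max(scored, key=lambda t: t[0])
        if acc <= best_acc:
            break
        selected.append(f)
        best_acc = acc
    return selected, best_acc


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
rng = random.Random(0)
X, y = [], []
for label in (0, 1):
    for _ in range(10):
        X.append([label + rng.uniform(0.0, 0.2), rng.uniform(0.0, 10.0)])
        y.append(label)

selected, accuracy = forward_sequential_selection(X, y)
print("selected features:", selected, "LOO accuracy:", accuracy)
```

Because the returned feature set applies to every query instance, this sketch has exactly the limitation the abstract describes: a feature that is relevant only in one region of the instance space must be either kept or dropped everywhere.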