CONTEXT-SENSITIVE FEATURE-SELECTION FOR LAZY LEARNERS

Authors

DOMINGOS P

Citation

P. Domingos, CONTEXT-SENSITIVE FEATURE-SELECTION FOR LAZY LEARNERS, Artificial intelligence review, 11(1-5), 1997, pp. 227-253

Subject categories

Computer Sciences, Special Topics; Computer Science Artificial Intelligence

Journal title

Artificial intelligence review

ISSN journal

02692821

Volume

11

Issue

1-5

Year of publication

1997

Pages

227 - 253

Database

ISI

SICI code

0269-2821(1997)11:1-5<227:CFFLL>2.0.ZU;2-L

Abstract

High sensitivity to irrelevant features is arguably the main shortcoming of simple lazy learners. In response to it, many feature selection methods have been proposed, including forward sequential selection (FSS) and backward sequential selection (BSS). Although they often produce substantial improvements in accuracy, these methods select the same set of relevant features everywhere in the instance space, and thus represent only a partial solution to the problem. In general, some features will be relevant only in some parts of the space; deleting them may hurt accuracy in those parts, but selecting them will have the same effect in parts where they are irrelevant. This article introduces RC, a new feature selection algorithm that uses a clustering-like approach to select sets of locally relevant features (i.e., the features it selects may vary from one instance to another). Experiments in a large number of domains from the UCI repository show that RC almost always improves accuracy with respect to FSS and BSS, often with high significance. A study using artificial domains confirms the hypothesis that this difference in performance is due to RC's context sensitivity, and also suggests conditions where this sensitivity will and will not be an advantage. Another feature of RC is that it is faster than FSS and BSS, often by an order of magnitude or more.
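
To make concrete the global selection that the abstract contrasts with RC, the following is a minimal sketch of forward sequential selection wrapped around a 1-nearest-neighbor learner. It is not the article's code and it does not implement RC, whose details the abstract does not give. The function names, the leave-one-out evaluation, the squared Euclidean distance, the starting accuracy of 0.0 for the empty feature set, and the stopping rule (stop when no single added feature strictly improves accuracy) are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Set, Tuple


def loo_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                 features: Set[int]) -> float:
    """Leave-one-out accuracy of 1-NN using only the given feature indices."""
    feats = sorted(features)
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue  # the instance being classified is held out
            # Squared Euclidean distance over the selected features only.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in feats)
            if d < best_d:
                best_d, best_j = d, j
        if best_j is not None and y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def forward_sequential_selection(
        X: Sequence[Sequence[float]], y: Sequence[int]
) -> Tuple[Set[int], float]:
    """Greedily add the single feature that most improves accuracy.

    Assumption: the empty feature set is given accuracy 0.0, and the
    search stops as soon as no added feature strictly improves accuracy.
    """
    n_features = len(X[0])
    selected: Set[int] = set()
    best_acc = 0.0
    while len(selected) < n_features:
        candidates: List[Tuple[float, int]] = [
            (loo_accuracy(X, y, selected | {f}), f)
            for f in range(n_features) if f not in selected
        ]
        # Highest accuracy wins; ties go to the lowest feature index.
        acc, f = max(candidates, key=lambda c: (c[0], -c[1]))
        if acc <= best_acc:
            break
        selected.add(f)
        best_acc = acc
    return selected, best_acc
```

The feature set returned here is applied to every query instance, which is exactly the limitation the abstract describes: a feature that matters in only one region of the instance space is either kept everywhere or dropped everywhere.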