Coping with missing attribute values based on closest fit in preterm birth data: A rough set approach

Citation
J.W. Grzymala-Busse et al., Coping with missing attribute values based on closest fit in preterm birth data: A rough set approach, Computational Intelligence, 17(3), 2001, pp. 425-434
Number of citations
19
Subject categories
AI, Robotics and Automatic Control
Journal title
COMPUTATIONAL INTELLIGENCE
Journal ISSN
0824-7935
Volume
17
Issue
3
Year of publication
2001
Pages
425 - 434
Database
ISI
SICI code
0824-7935(200108)17:3<425:CWMAVB>2.0.ZU;2-Z
Abstract
Data mining is frequently applied to data sets with missing attribute values. A new approach to missing attribute values, called closest fit, is introduced in this paper. In this approach, for a given case (example) with a missing attribute value, we search for another case that is as similar as possible to the given case. Cases can be considered as vectors of attribute values. The search is for the case that has as many identical attribute values as possible for symbolic attributes, or the smallest possible value differences for numerical attributes. There are two possible ways to conduct a search: within the same class (concept) as the case with the missing attribute values, or over the entire set of all cases. For comparison, we also experimented with another approach to missing attribute values, in which the missing values are replaced by the most common value of the attribute for symbolic attributes or by the average value for numerical attributes. All algorithms were implemented in the system OOMIS. Our experiments were performed on the preterm birth data sets provided by the Duke University Medical Center.
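The two strategies described in the abstract can be illustrated with a minimal Python sketch. This is not the OOMIS implementation. The abstract does not give the exact distance formula, so the one below is an assumption: symbolic attributes contribute 0 or 1, numerical attributes contribute the absolute difference divided by the attribute's observed range, and any comparison involving a missing value counts as 1. The function names, the `MISSING` marker, and the toy data are hypothetical.

```python
from collections import Counter
from statistics import mean

MISSING = None  # hypothetical marker for a missing attribute value


def is_numeric(values):
    """True if every known value of an attribute is a number."""
    return all(isinstance(v, (int, float)) for v in values)


def distance(x, y, ranges):
    """Assumed distance between two cases; smaller means more similar."""
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if a is MISSING or b is MISSING:
            total += 1.0  # assumption: unknown values count as maximally different
        elif i in ranges:
            total += abs(a - b) / ranges[i] if ranges[i] else 0.0
        elif a != b:
            total += 1.0  # symbolic attributes: 0 if identical, 1 otherwise
    return total


def closest_fit(cases, labels, same_class=True):
    """Fill each missing value from the most similar case that has that value.

    same_class=True searches only cases with the same class (concept);
    same_class=False searches the entire set of cases.
    """
    n_attrs = len(cases[0])
    ranges = {}
    for i in range(n_attrs):
        known = [c[i] for c in cases if c[i] is not MISSING]
        if known and is_numeric(known):
            ranges[i] = max(known) - min(known)
    filled = [list(c) for c in cases]
    for k, case in enumerate(cases):
        for i in range(n_attrs):
            if case[i] is not MISSING:
                continue
            candidates = [
                (distance(case, other, ranges), j)
                for j, other in enumerate(cases)
                if j != k
                and other[i] is not MISSING
                and (not same_class or labels[j] == labels[k])
            ]
            if candidates:  # leave the value missing if no donor case exists
                _, best = min(candidates)
                filled[k][i] = cases[best][i]
    return filled


def fill_with_common_or_mean(cases):
    """Comparison method: most common value (symbolic) or average (numerical)."""
    filled = [list(c) for c in cases]
    for i in range(len(cases[0])):
        known = [c[i] for c in cases if c[i] is not MISSING]
        if not known:
            continue
        if is_numeric(known):
            value = mean(known)
        else:
            value = Counter(known).most_common(1)[0][0]
        for row in filled:
            if row[i] is MISSING:
                row[i] = value
    return filled


# Toy data (invented for illustration, not from the preterm birth data sets).
cases = [
    ["yes", 30.0, "a"],
    ["yes", MISSING, "a"],
    ["no", 40.0, "b"],
    ["no", 41.0, MISSING],
]
labels = ["preterm", "preterm", "fullterm", "fullterm"]

print(closest_fit(cases, labels, same_class=True))
print(closest_fit(cases, labels, same_class=False))
print(fill_with_common_or_mean(cases))
```

Ties between equally close cases are broken here by the lower case index, which is another assumption of the sketch rather than something stated in the abstract.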