A. Kumar, NEW TECHNIQUES FOR DATA REDUCTION IN A DATABASE SYSTEM FOR KNOWLEDGE DISCOVERY APPLICATIONS, JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 10(1), 1998, pp. 31-48
Citations number
25
Subject Categories
Computer Science Information Systems; Computer Science Artificial Intelligence
Databases store large amounts of information about consumer
transactions and other kinds of transactions. This information can be
used to deduce rules about consumer behavior, and the rules can in turn
be used to determine company policies, for instance with regard to
production, marketing, and several other areas. Since databases
typically store millions of records, and each record can have 100 or
more attributes, it is necessary as an initial step to reduce the size
of the database by eliminating attributes that influence the decision
minimally or not at all. In this paper we present techniques that can
be employed effectively for exact and approximate reduction in a
database system. These techniques can be implemented efficiently using
SQL (Structured Query Language) commands. We tested their performance
on a real data set and validated them. The results showed that
classification performance actually improved with a reduced set of
attributes compared with the case in which all attributes were present.
We also discuss how our techniques differ from statistical methods and
from other data reduction methods such as rough sets.
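
The abstract does not spell out the reduction techniques themselves, so the following is only an illustrative sketch of what an exact attribute-reduction check driven by SQL might look like, not the paper's algorithm. It assumes one plausible criterion: an attribute is dispensable if dropping it creates no new groups of records that agree on the remaining attributes but disagree on the decision. The table `purchases`, its columns, and both function names are hypothetical.

```python
import sqlite3


def conflicting_groups(conn, table, attrs, decision):
    """Count groups of records that agree on `attrs` but disagree on `decision`."""
    cols = ", ".join(attrs)
    # Identifiers are interpolated directly; this sketch assumes trusted names.
    query = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT 1 FROM {table} GROUP BY {cols} "
        f"HAVING COUNT(DISTINCT {decision}) > 1)"
    )
    return conn.execute(query).fetchone()[0]


def dispensable_attributes(conn, table, attrs, decision):
    """Return attributes whose individual removal adds no conflicting groups."""
    base = conflicting_groups(conn, table, attrs, decision)
    result = []
    for attr in attrs:
        rest = [a for a in attrs if a != attr]
        if rest and conflicting_groups(conn, table, rest, decision) == base:
            result.append(attr)
    return result


def demo():
    # Hypothetical toy data: the decision `buys` depends only on `age_band`.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases "
        "(age_band TEXT, region TEXT, weekday TEXT, buys TEXT)"
    )
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?, ?, ?)",
        [
            ("young", "north", "mon", "yes"),
            ("young", "south", "tue", "yes"),
            ("old", "north", "mon", "no"),
            ("old", "south", "wed", "no"),
        ],
    )
    return dispensable_attributes(
        conn, "purchases", ["age_band", "region", "weekday"], "buys"
    )
```

On this toy table, `demo()` reports `region` and `weekday` as dispensable and keeps `age_band`. Each attribute is tested on its own, so a set of individually dispensable attributes is not guaranteed to be removable together; an approximate variant could instead tolerate a small increase in the conflict count.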