NEW TECHNIQUES FOR DATA REDUCTION IN A DATABASE SYSTEM FOR KNOWLEDGE DISCOVERY APPLICATIONS

Authors
Citation
A. Kumar, NEW TECHNIQUES FOR DATA REDUCTION IN A DATABASE SYSTEM FOR KNOWLEDGE DISCOVERY APPLICATIONS, JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 10(1), 1998, pp. 31-48
Citations number
25
Subject Categories
Computer Science, Information Systems; Computer Science, Artificial Intelligence
ISSN journal
09259902
Volume
10
Issue
1
Year of publication
1998
Pages
31 - 48
Database
ISI
SICI code
0925-9902(1998)10:1<31:NTFDRI>2.0.ZU;2-S
Abstract
Databases store large amounts of information about consumer transactions and other kinds of transactions. This information can be used to deduce rules about consumer behavior, and the rules can in turn be used to determine company policies, for instance with regard to production, marketing, and several other areas. Since databases typically store millions of records, and each record could have 100 or more attributes, it is necessary as an initial step to reduce the size of the database by eliminating attributes that do not influence the decision at all or do so only minimally. In this paper we present techniques that can be employed effectively for exact and approximate reduction in a database system. These techniques can be implemented efficiently in a database system using SQL (Structured Query Language) commands. We tested their performance on a real data set and validated them. The results showed that the classification performance actually improved with a reduced set of attributes as compared to the case when all the attributes were present. We also discuss how our techniques differ from statistical methods and other data reduction methods such as rough sets.
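
The abstract does not spell out the paper's algorithms, so the following is only an illustrative sketch of what SQL-based exact attribute reduction could look like, not the author's method. It assumes one simple criterion: an attribute may be dropped if the remaining attributes still determine the decision, that is, no group of records with identical remaining attribute values carries more than one decision value. The table `sales`, its columns, the sample rows, and the helper names `inconsistent_groups` and `exact_reduce` are all hypothetical.

```python
import sqlite3

def inconsistent_groups(conn, table, attrs, decision):
    """Count attribute-value groups that map to more than one decision value."""
    cols = ", ".join(attrs)
    sql = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {cols} FROM {table} GROUP BY {cols} "
        f"HAVING COUNT(DISTINCT {decision}) > 1)"
    )
    return conn.execute(sql).fetchone()[0]

def exact_reduce(conn, table, attrs, decision):
    """Greedily drop attributes while the rest still determine the decision.

    Assumed criterion (not taken from the paper): a drop is accepted only
    if it leaves zero inconsistent groups.
    """
    kept = list(attrs)
    for attr in attrs:
        trial = [a for a in kept if a != attr]
        if trial and inconsistent_groups(conn, table, trial, decision) == 0:
            kept = trial
    return kept

# Hypothetical toy data: the decision `buys` depends on `age` alone.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (age TEXT, region TEXT, color TEXT, buys TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("young", "north", "red", "yes"),
        ("young", "south", "blue", "yes"),
        ("old", "north", "red", "no"),
        ("old", "south", "blue", "no"),
        ("old", "south", "red", "no"),
    ],
)

kept = exact_reduce(conn, "sales", ["age", "region", "color"], "buys")
print(kept)
```

An approximate variant could, under the same assumptions, accept a drop when the number of inconsistent groups stays below a chosen tolerance instead of requiring zero. The greedy order matters: a different attribute order can yield a different reduced set.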