Design of soft computing models for data mining applications

Citation
S. Sumathi, S.N. Sivanandam and Jagadeeswari, Design of soft computing models for data mining applications, I J ENG M S, 7(3), 2000, pp. 107-121
Number of citations
19
Subject categories
Engineering Management /General
Journal title
INDIAN JOURNAL OF ENGINEERING AND MATERIALS SCIENCES
Journal ISSN
0971-4588
Volume
7
Issue
3
Year of publication
2000
Pages
107 - 121
Database
ISI
SICI code
0971-4588(200006)7:3<107:DOSCMF>2.0.ZU;2-Z
Abstract
Although modern technologies enable the storage of large streams of data, there is no technology that helps users understand, analyze and visualize the information hidden in that data. Data mining, also called data or knowledge discovery, is the process of analyzing data from different perspectives and summarizing it into useful information. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Pattern classification is one particular category of data mining, which enables the discovery of knowledge from very large databases (VLDB). Data mining can be applied to a wide range of applications such as business forecasting, decision support systems, sonar, radar, seismic analysis and medical diagnosis.
Artificial neural networks, which offer better noise immunity and shorter training time, are used to mine the database. A self-organizing neural network architecture called predictive ART, or ARTMAP, is introduced that is capable of fast, stable learning and hypothesis testing in response to an arbitrary stream of input patterns. A generalization of binary ARTMAP is fuzzy ARTMAP, which learns to classify input by a pattern of fuzzy membership values between 0 and 1, indicating the extent to which each feature is present. A generalization of fuzzy ARTMAP is Cascade ARTMAP, in which pre-existing symbolic rules are used to initialize the network before learning so that network efficiency is increased. This rule insertion also provides the network with knowledge that cannot be captured by training examples. Interpretation of the knowledge learned by this neural network leads to more compact and simpler rules than the back-propagation approach. Another self-organizing algorithm, based on the Kohonen architecture, is proposed; it also requires less time and gives high prediction accuracy compared with the back-propagation network (BPN).
Moreover, the rules extracted from this network are very simple compared with those of the BPN approach. Finally, the extracted rules have been validated for correctness. This approach is most widely used in the medical industry for correct prediction when the database is large. Manual mining of such voluminous data is very difficult and time-consuming, and it may sometimes lead to incorrect predictions. Hence, data mining software has been developed. The performance of all three networks, namely Cascade ARTMAP, fuzzy ARTMAP and Kohonen, has been evaluated and compared with conventional methods. Simulation is carried out using medical databases taken from the UCI repository of machine learning databases. The developed data mining software can also be used for other applications such as the Web, communications, and pattern recognition.
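The abstract describes fuzzy ARTMAP as classifying inputs by fuzzy membership values between 0 and 1. As an illustration only, the sketch below shows the unsupervised fuzzy ART category search that fuzzy ARTMAP builds on, using the formulation commonly given in the literature (complement coding, a choice function, a vigilance test, and a min-based weight update). It is not the authors' implementation: the function names and the parameter values `rho`, `alpha` and `beta` are assumptions chosen for the example, and the supervised map field and match tracking of full ARTMAP are omitted.

```python
def complement_code(a):
    """Map features in [0, 1] to [a, 1 - a] so every input has the same L1 norm."""
    return list(a) + [1.0 - x for x in a]


def fuzzy_and(x, y):
    """Component-wise minimum, the fuzzy AND operator."""
    return [min(p, q) for p, q in zip(x, y)]


def choice(i, w, alpha):
    """Choice function T_j = |I ^ w_j| / (alpha + |w_j|)."""
    return sum(fuzzy_and(i, w)) / (alpha + sum(w))


def match(i, w):
    """Match function |I ^ w_j| / |I|, compared against the vigilance rho."""
    return sum(fuzzy_and(i, w)) / sum(i)


def present(i, weights, rho=0.75, alpha=0.001, beta=1.0):
    """Present one coded input; return the index of the category that learns it.

    Categories are tried in order of decreasing choice value. The first one
    that passes the vigilance test is updated; if none passes, a new category
    is committed. rho, alpha and beta are illustrative values (assumptions).
    """
    order = sorted(range(len(weights)),
                   key=lambda j: choice(i, weights[j], alpha),
                   reverse=True)
    for j in order:
        if match(i, weights[j]) >= rho:
            m = fuzzy_and(i, weights[j])
            # beta = 1.0 corresponds to fast learning: w_j becomes I ^ w_j.
            weights[j] = [beta * mk + (1.0 - beta) * wk
                          for mk, wk in zip(m, weights[j])]
            return j
    weights.append(list(i))
    return len(weights) - 1


if __name__ == "__main__":
    categories = []
    for features in ([0.2, 0.8], [0.25, 0.75], [0.9, 0.1]):
        index = present(complement_code(features), categories)
        print(features, "-> category", index)
```

With these assumed settings, the two nearby inputs share a category while the distant one commits a new category. Raising `rho` makes the categories narrower and more numerous.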