DEMON: Mining and monitoring evolving data

Citation
V. Ganti et al., DEMON: Mining and monitoring evolving data, IEEE KNOWL, 13(1), 2001, pp. 50-63
Citations number
30
Categorie Soggetti
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
ISSN journal
10414347 → ACNP
Volume
13
Issue
1
Year of publication
2001
Pages
50 - 63
Database
ISI
SICI code
1041-4347(200101/02)13:1<50:DMAMED>2.0.ZU;2-Z
Abstract
Data mining algorithms have been the focus of much research recently. In pr actice, the input data to a data mining process resides in a large data war ehouse whose data is kept up-to-date through periodic or occasional additio n and deletion of blocks of data. Most data mining algorithms have either a ssumed that the input data is static, or have been designed for arbitrary i nsertions and deletions of data records. In this paper, we consider a dynam ic environment that evolves through systematic addition or deletion of bloc ks of data. We introduce a new dimension, called the data span dimension, w hich allows user-defined selections of a temporal subset of the database. T aking this new degree of freedom into account, we describe efficient model maintenance algorithms for frequent itemsets and clusters. We then describe a generic algorithm that takes any traditional incremental model maintenan ce algorithm and transforms it into an algorithm that allows restrictions o n the data span dimension. We also develop an algorithm for automatically d iscovering a specific class of interesting block selection sequences. In a detailed experimental study, we examine the validity and performance of our ideas on synthetic and real datasets.