Theory of dependence values

Authors

Meo, R

Citation

R. Meo, Theory of dependence values, ACM T DATAB, 25(3), 2000, pp. 380-406

Citations number

Subject categories

Computer Science & Engineering

Journal title

ACM TRANSACTIONS ON DATABASE SYSTEMS

ISSN journal

0362-5915

Volume

25

Issue

3

Year of publication

2000

Pages

380 - 406

Database

ISI

SICI code

0362-5915(200009)25:3<380:TODV>2.0.ZU;2-V

Abstract

A new model to evaluate dependencies in data mining problems is presented and discussed. The well-known concept of the association rule is replaced by the new definition of dependence value, which is a single real number uniquely associated with a given itemset. Knowledge of dependence values is sufficient to describe all the dependencies characterizing a given data mining problem. The dependence value of an itemset is the difference between the occurrence probability of the itemset and a corresponding "maximum independence estimate." This can be determined as a function of joint probabilities of the subsets of the itemset being considered by maximizing a suitable entropy function. So it is possible to separate in an itemset of cardinality k the dependence inherited from its subsets of cardinality (k - 1) and the specific inherent dependence of that itemset. The absolute value of the difference between the probability p(i) of the event i that indicates the presence of the itemset {a, b, ...} and its maximum independence estimate is constant for any combination of values of (a, b, ...). In addition, the Boolean function specifying the combinations of values for which the dependence is positive is a parity function. So the determination of such combinations is immediate. The model appears to be simple and powerful.
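The two-item case of the abstract can be illustrated with a minimal sketch. It assumes that, for an itemset of cardinality 2, the maximum independence estimate reduces to the product of the single-item probabilities (the maximum-entropy joint distribution under fixed marginals); larger itemsets need the entropy maximization described in the paper, which is not reproduced here. The function name `pair_dependence` and the sample transactions are hypothetical and do not come from the paper.

```python
from itertools import product


def pair_dependence(transactions, a, b):
    """Return (dependence value, deviations) for the item pair (a, b).

    Assumption: for two items, the maximum independence estimate is
    p(a) * p(b). `deviations` maps every presence/absence combination
    (x, y) to p(a=x, b=y) - p(a=x) * p(b=y).
    """
    n = len(transactions)
    # Empirical joint distribution over presence (1) / absence (0).
    joint = {combo: 0.0 for combo in product((0, 1), repeat=2)}
    for t in transactions:
        joint[(int(a in t), int(b in t))] += 1.0 / n
    # Single-item marginals derived from the joint distribution.
    p_a = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
    p_b = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}
    deviations = {
        (x, y): joint[(x, y)] - p_a[x] * p_b[y] for (x, y) in joint
    }
    # The dependence value is the deviation for "both items present".
    return deviations[(1, 1)], deviations


# Hypothetical sample data, used only for illustration.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"},
    {"bread", "milk", "eggs"}, {"bread"}, {"milk"}, {"eggs"}, {"eggs"},
]

dv, deviations = pair_dependence(transactions, "bread", "milk")
for combo, value in sorted(deviations.items()):
    print(combo, value)
```

On this sample the four deviations share the same absolute value, and their sign alternates with the parity of the number of items present, which matches the two properties stated in the abstract for this simple case.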