How to generate Association Rules from Frequent Itemsets?
Generating Association Rules from Frequent Itemsets
Once the frequent itemsets from the transactions in a database D have been found, it is straightforward to generate strong association rules from them (where strong association rules satisfy both minimum support and minimum confidence). This can be done using Eq. (6.4) for confidence, which we show again here for completeness:
confidence(A ⇒ B) = P(B | A) = support_count(A ∪ B) / support_count(A)
The conditional probability is expressed in terms of itemset support counts, where support_count(A ∪ B) is the number of transactions containing the itemset A ∪ B (that is, every item of both A and B), and support_count(A) is the number of transactions containing the itemset A. Based on this equation, association rules can be generated as follows:
■ For each frequent itemset I, generate all nonempty subsets of I.
■ For every nonempty proper subset s of I (so that I − s is also nonempty), output the rule "s ⇒ (I − s)" if support_count(I) / support_count(s) ≥ min_conf, where min_conf is the minimum confidence threshold.
Because the rules are generated from frequent itemsets, each one automatically satisfies the minimum support. Frequent itemsets can be stored ahead of time in hash tables along with their counts so that they can be accessed quickly.
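The two steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name generate_rules, the dictionary layout (a hash table keyed by frozenset), and the toy counts for items "a" and "b" are all assumptions made for the example. It relies on the dictionary holding a count for every nonempty subset of each frequent itemset, which the Apriori property guarantees when all frequent itemsets are stored.

```python
from itertools import combinations


def generate_rules(support_counts, min_conf):
    """Return (antecedent, consequent, confidence) for every strong rule.

    support_counts: dict mapping each frequent itemset (a frozenset) to its
    support count. Every nonempty subset of a stored itemset must also be a
    key, which holds when all frequent itemsets are stored.
    """
    rules = []
    for itemset, count in support_counts.items():
        # Proper subsets only: sizes 1 .. len(itemset) - 1, so that the
        # consequent (itemset - s) is never empty.
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                s = frozenset(subset)
                conf = count / support_counts[s]  # hash-table lookup
                if conf >= min_conf:
                    rules.append((s, itemset - s, conf))
    return rules


# Hypothetical toy counts, not taken from the AllElectronics data.
toy_counts = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 5,
    frozenset({"a", "b"}): 3,
}
rules = generate_rules(toy_counts, min_conf=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), f"{conf:.0%}")
```

With these toy counts, the rule a ⇒ b has confidence 3/4 and passes the 70% threshold, while b ⇒ a has confidence 3/5 and is dropped.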
Example: Generating association rules. Let's try an example based on the transactional data for AllElectronics shown before in Table 6.1.
The data contain the frequent itemset X = {I1, I2, I5}. What are the association rules that can be generated from X? The nonempty proper subsets of X are {I1, I2}, {I1, I5}, {I2, I5}, {I1}, {I2}, and {I5}. The resulting association rules are as shown below, each listed with its confidence:
{I1, I2} ⇒ I5, confidence = 2/4 = 50%
{I1, I5} ⇒ I2, confidence = 2/2 = 100%
{I2, I5} ⇒ I1, confidence = 2/2 = 100%
I1 ⇒ {I2, I5}, confidence = 2/6 = 33%
I2 ⇒ {I1, I5}, confidence = 2/7 = 29%
I5 ⇒ {I1, I2}, confidence = 2/2 = 100%
If the minimum confidence threshold is, say, 70%, then only the second, third, and last rules are output, because these are the only ones generated that are strong. Note that, unlike conventional classification rules, association rules can contain more than one conjunct in the right side of the rule.
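The arithmetic in this example can be checked with a short Python script. This is only a sketch: the support counts are the numerators and denominators quoted in the rules above, while the variable names (counts, X, MIN_CONF, strong) are illustrative choices, not part of the original text.

```python
from itertools import combinations

# Support counts quoted in the example's confidence fractions.
counts = {
    frozenset({"I1"}): 6,
    frozenset({"I2"}): 7,
    frozenset({"I5"}): 2,
    frozenset({"I1", "I2"}): 4,
    frozenset({"I1", "I5"}): 2,
    frozenset({"I2", "I5"}): 2,
    frozenset({"I1", "I2", "I5"}): 2,
}
X = frozenset({"I1", "I2", "I5"})
MIN_CONF = 0.70  # the 70% threshold used in the example

strong = []
# Enumerate every nonempty proper subset s of X and score the rule s => X - s.
for size in range(1, len(X)):
    for subset in combinations(sorted(X), size):
        s = frozenset(subset)
        conf = counts[X] / counts[s]
        is_strong = conf >= MIN_CONF
        if is_strong:
            strong.append((s, X - s))
        marker = "  (strong)" if is_strong else ""
        print(f"{sorted(s)} => {sorted(X - s)}: {conf:.0%}{marker}")
```

Only the three rules with 100% confidence end up in strong, matching the second, third, and last rules listed above.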