Explain Association Rule with example.

 Association Rule

  • Proposed by Agrawal et al. in 1993.
  • It is an important data mining model studied extensively by the database and data mining community.
  • Assumes all data are categorical.
  • There is no good algorithm for numeric data.
  • Initially used for Market Basket Analysis to find how items purchased by customers are related.
  • Given a set of records, each of which contains some number of items from a given collection:
      – Produce dependency rules that predict the occurrence of an item based on the occurrences of other items.

                                      OR,

  • Association is one of the best-known data mining techniques. In association, a pattern is discovered based on a relationship between items in the same transaction. That is why the association technique is also known as the relation technique. It is used in market basket analysis to identify sets of products that customers frequently purchase together.
  • Retailers use the association technique to study customers’ buying habits. Based on historical sales data, a retailer might find that customers who buy beer almost always buy crisps as well, and can therefore place beer and crisps next to each other to save time for the customer and increase sales.



Applications:

Basket data analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.

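E.g., 98% of people who purchase tires and auto accessories also get automotive services done. Written as a rule, this is {tires, auto accessories} → {automotive services} with a confidence of 98%.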


Concepts:

An item: an item/article in a basket

I: the set of all items sold in the store

A transaction: items purchased in a basket; it may have TID (transaction ID)

A transactional dataset: A set of transactions
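
The concepts above can be written down as a small data structure. The following is a minimal sketch in Python; the item names, the TIDs, and the variable names `transactions` and `all_items` are made up for illustration and do not come from any real dataset.

```python
from typing import Dict, FrozenSet

# A transactional dataset: maps each TID to the set of items in that basket.
# All items and TIDs here are hypothetical.
transactions: Dict[int, FrozenSet[str]] = {
    1: frozenset({"beer", "crisps", "bread"}),
    2: frozenset({"beer", "crisps"}),
    3: frozenset({"bread", "milk"}),
    4: frozenset({"beer", "crisps", "milk"}),
    5: frozenset({"bread", "milk", "crisps"}),
}

# I: the set of all items. As an assumption, it is taken to be every item
# that appears in at least one basket (a real store may sell more).
all_items: FrozenSet[str] = frozenset().union(*transactions.values())

for tid, basket in transactions.items():
    print(tid, sorted(basket))
```

Each basket is stored as a set because an association rule only cares whether an item is present in a transaction, not how many units were bought or in which order.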







Mining Association Rules:

What We Need to Know

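Goal: Rules with high support and confidence
How to compute?
Support: the fraction of transactions that contain an itemset. First find the sets of items that occur frequently.
Confidence: for a rule X → Y, the support of X ∪ Y divided by the support of X. It is computed from the frequencies of subsets of the frequent itemsets.

If we have all frequently occurring sets of items (frequent itemsets), we can compute the support and confidence of every candidate rule!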
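
The two steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the baskets are made up, the function names (`support`, `confidence`, `frequent_itemsets`, `rules`) are hypothetical, and the frequent itemsets are found by brute-force enumeration rather than by an efficient algorithm such as Apriori, so it is only practical for a handful of items.

```python
from itertools import combinations

# Hypothetical baskets, one set of items per transaction.
transactions = [
    {"beer", "crisps", "bread"},
    {"beer", "crisps"},
    {"bread", "milk"},
    {"beer", "crisps", "milk"},
    {"bread", "milk", "crisps"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

def frequent_itemsets(transactions, min_support):
    """Step 1: enumerate all itemsets and keep those meeting min_support."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                frequent[frozenset(candidate)] = s
    return frequent

def rules(transactions, min_support, min_confidence):
    """Step 2: split each frequent itemset into antecedent -> consequent
    and keep the rules meeting min_confidence."""
    found = []
    for itemset, s in frequent_itemsets(transactions, min_support).items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                consequent = itemset - set(antecedent)
                c = confidence(antecedent, consequent, transactions)
                if c >= min_confidence:
                    found.append((frozenset(antecedent), consequent, s, c))
    return found

for lhs, rhs, s, c in rules(transactions, min_support=0.5, min_confidence=0.7):
    print(sorted(lhs), "->", sorted(rhs), f"support={s:.2f} confidence={c:.2f}")
```

With these baskets, beer and crisps appear together in 3 of 5 transactions, and every basket with beer also has crisps, so the rule beer → crisps has higher confidence than crisps → beer even though both rules share the same support.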
