Write the limitation of Apriori algorithm.
Limitations: The Apriori algorithm is easy to understand, and its join and prune steps are easy to implement on large itemsets in large databases. Along with these advantages, it has a number of limitations. These are:
1. Huge number of candidates: Candidate generation is the inherent cost of the Apriori algorithm, no matter what implementation technique is applied, and handling a huge number of candidate sets is costly. For example, if there are 10^4 frequent 1-itemsets, the algorithm needs to generate more than 10^7 candidate 2-itemsets. Moreover, to discover a single frequent pattern of size 100, it must generate more than 2^100 (approximately 10^30) candidates in total; the short sketch after this list verifies these counts.
2. Multiple scans of the transaction database: the database is scanned once for every pass, so the algorithm is not a good choice for mining long patterns in large data sets.
3. When the database is scanned to count the candidates in C_k and create F_k, a large number of transactions are scanned even if they do not contain any frequent k-itemset.
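The candidate counts quoted above can be checked directly. The short Python sketch below is only an illustration (the variable names are made up); it computes the number of candidate 2-itemsets obtained by pairing 10^4 frequent 1-itemsets, and the roughly 2^100 candidates implied by a single frequent pattern of length 100.

```python
from math import comb

# Pairing 10^4 frequent 1-itemsets gives C(10^4, 2) candidate 2-itemsets.
n_frequent_1 = 10**4
candidate_2_itemsets = comb(n_frequent_1, 2)
print(candidate_2_itemsets)                 # 49995000, i.e. more than 10^7

# A frequent pattern of length 100 implies that every non-empty subset of
# its 100 items is frequent, so Apriori enumerates 2^100 - 1 candidates.
candidates_for_length_100 = 2**100 - 1
print(f"{candidates_for_length_100:.2e}")   # about 1.27e+30
```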
OR,
Limitations of Apriori Algorithm
The Apriori algorithm can be slow. Its main limitation is the time required to hold a vast number of candidate sets when there are many frequent itemsets, the minimum support is low, or the itemsets are large; in other words, it is not an efficient approach for large data sets. For example, if there are 10^4 frequent 1-itemsets, it needs to generate more than 10^7 candidate 2-itemsets, which must then be tested and accumulated. Furthermore, to detect a frequent pattern of size 100, i.e. {v1, v2, ..., v100}, it has to generate 2^100 candidate itemsets, which makes candidate generation costly and time-consuming. The algorithm therefore checks a very large number of candidate itemsets and scans the database repeatedly to count them. Apriori becomes very slow and inefficient when memory capacity is limited and the number of transactions is large.
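To make the repeated-scanning point concrete, below is a minimal, simplified Apriori sketch in Python (an illustration only; the function and variable names such as apriori_frequent_itemsets, transactions and min_support are made up, and the join step is simplified by enumerating combinations and then pruning). Each pass of the while loop re-reads the entire transaction list, so mining a pattern of length k forces k full database scans.

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, min_support):
    """Simplified Apriori: one full pass over `transactions` per level k."""
    # Level 1: count individual items (first full scan of the database).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s for s, c in counts.items() if c >= min_support}
    all_frequent, k = set(frequent), 2
    while frequent:
        # Join + prune: keep k-combinations whose (k-1)-subsets are all frequent.
        items = sorted({i for s in frequent for i in s})
        candidates = {frozenset(c) for c in combinations(items, k)
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))}
        # Counting the candidates requires another full scan of the database,
        # even over transactions that are too short to contain any k-itemset.
        counts = {c: sum(1 for t in transactions if c <= set(t))
                  for c in candidates}
        frequent = {c for c, n in counts.items() if n >= min_support}
        all_frequent |= frequent
        k += 1
    return all_frequent
```

For instance, apriori_frequent_itemsets([["a","b","c"], ["a","b"], ["a","c"], ["b","c"], ["a","b","c"]], min_support=3) scans this small database three times before it can conclude that no frequent 3-itemset exists.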
OR,
LIMITATIONS OF THE APRIORI ALGORITHM
One of the biggest limitations of the Apriori algorithm is that it is slow. This is mainly caused by:
A large number of itemsets in the dataset.
A low minimum support threshold in the dataset.
The time needed to hold a large number of candidate sets with many frequent itemsets.
Thus it is inefficient when used with large volumes of data.
As an example, assume there are 10^4 frequent 1-itemsets. The Apriori algorithm then needs to generate more than 10^7 candidate 2-itemsets, which are subsequently tested and accumulated. To detect a frequent pattern of size 100 (containing v1, v2, ..., v100), the algorithm has to generate 2^100 candidate itemsets.
Hence, the cost escalates and a lot of time is wasted in candidate generation. In addition, to check the many candidate itemsets, the algorithm scans the database many times, consuming expensive resources. This hurts performance further when system memory is insufficient and there are a large number of transactions. That is why the algorithm becomes inefficient and slow on large databases.