Abstract
|
|
---|---|
Association rule mining is one of the most popular exploratory data mining techniques for discovering interesting and previously unknown correlations in datasets. The main goal of association rule algorithms is to find the most frequent sets of variables, and then to find the correlations between the frequent items. Current algorithms for association rule mining are computationally expensive, especially for very large datasets. Moreover, the large number of discovered frequent itemsets hinders the application of these algorithms to many real-world datasets. Usually, longer frequent itemsets are more interesting, and finding the set of maximal length itemsets is useful for many applications. We introduce a novel algorithm, called Width-Sort, that efficiently discovers the maximal length frequent itemsets. In Width-Sort, the dataset is partitioned based on the transaction lengths to reflect the additional information hidden in them. Lemmas are developed to estimate an upper bound for the maximal length of the frequent itemsets, as well as to prune the items that cannot be part of the maximal length frequent itemsets. The efficiency of the algorithm is tested using both simulated and real-world datasets. | |
International
|
Yes |
Congress
|
9th International Conference of the ERCIM. Computational and Methodological Statistics (CMStatistics 2016) |
|
730 |
Place
|
Sevilla (Spain) |
Reviewers
|
Yes |
ISBN/ISSN
|
978-9963-2227-1-1 |
Start Date
|
09/12/2016 |
End Date
|
11/12/2016 |
From page
|
157 |
To page
|
157 |
|
9th International Conference of the ERCIM (European Research Consortium for Informatics and Mathematics) Working Group on Computational and Methodological Statistics (CMStatistics 2016) |
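
The abstract mentions bounding the maximal length of frequent itemsets and pruning items by using transaction lengths. The sketch below illustrates one generic way such a length-based bound can work; it is not the authors' Width-Sort algorithm or their lemmas, which this record does not spell out. It relies only on the standard observation that an itemset of length k supported by `min_count` transactions requires at least `min_count` transactions of length k or more. The function names, the `min_count` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def max_length_upper_bound(transactions, min_count):
    """Upper bound on the length of any itemset with support >= min_count.

    A k-itemset can only be contained in transactions of length >= k, so k
    cannot exceed the min_count-th largest transaction length.
    (Illustrative helper, not taken from the paper.)
    """
    lengths = sorted((len(set(t)) for t in transactions), reverse=True)
    if min_count < 1 or len(lengths) < min_count:
        return 0
    return lengths[min_count - 1]


def prune_items(transactions, min_count, k):
    """Items that could still belong to a frequent itemset of length k.

    An item qualifies only if it occurs in at least min_count transactions
    whose length is >= k. (Illustrative helper, not taken from the paper.)
    """
    counts = Counter()
    for t in transactions:
        items = set(t)
        if len(items) >= k:
            counts.update(items)
    return {item for item, c in counts.items() if c >= min_count}


# Hypothetical toy dataset.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a"},
    {"b", "c", "d"},
]

bound = max_length_upper_bound(transactions, min_count=3)
survivors = prune_items(transactions, min_count=3, k=bound)
```

In this toy example, fewer items survive the pruning than the bound requires, so the bound could be lowered and the two steps repeated; an actual implementation would iterate until the bound and the surviving item set are consistent.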